PREFACE TO THE SECOND EDITION


Although it was in print for a short time only, the original edition 
of Multiplicative Number Theory had a major impact on research 
and on young mathematicians. By giving a connected account of 
the large sieve and Bombieri’s theorem, Professor Davenport made 
accessible an important body of new discoveries. With this stimula- 
tion, such great progress was made that our current understanding 
of these topics extends well beyond what was known in 1966. As the 
main results can now be proved much more easily, I made the
radical decision to rewrite §§23-29 completely for the second 
edition. In making these alterations I have tried to preserve the tone 
and spirit of the original. 

Rather than derive Bombieri’s theorem from a zero density 
estimate for L functions, as Davenport did, I have chosen to present
Vaughan’s elementary proof of Bombieri’s theorem. This approach 
depends on Vaughan’s simplified version of Vinogradov’s method 
for estimating sums over prime numbers (see §24). Vinogradov 
devised his method in order to estimate the sum $\sum_{p \le x} e(p\alpha)$; to
maintain the historical perspective I have inserted (in §§25, 26) 
a discussion of this exponential sum and its application to sums of 
primes, before turning to the large sieve and Bombieri’s theorem. 

Before Professor Davenport’s untimely death in 1969, several 
mathematicians had suggested small improvements which might be 
made in Multiplicative Number Theory, should it ever be reprinted. 
Most of these have been incorporated here; in particular, the nice 
refinements in §§12 and 14 were suggested by Professor E. Wirsing.
Professor L. Schoenfeld detected the only significant error in the 
book, in the proof of Theorems 4 and 4A of §23. Indeed these 
theorems are false as they stood, although their corollaries, which 
were used later, are true. In considering the extent and nature of my 
revisions, I have benefited from the advice of Professors Baker, 
Bombieri, Cassels, Halberstam, Hooley, Mack, Schmidt, and 
Vaughan, although the responsibility for the decisions taken is
entirely my own. The assistance throughout of Mrs. H. Davenport 
and Dr. J. H. Davenport has been invaluable. Finally, the
mathematical community is indebted to Professor J.-P. Serre for
urging Springer-Verlag to publish a new edition of this important 
book. 


H.L.M. 


PREFACE TO THE FIRST EDITION 


My principal object in these lectures was to give a connected 
account of analytic number theory in so far as it relates to problems 
of a multiplicative character, with particular attention to the distribu- 
tion of primes in arithmetic progressions. Most of the work is by 
now classical, and I have followed to a considerable extent the 
historical order of discovery. I have included some material which, 
though familiar to experts, cannot easily be found in the existing 
expositions. 

My secondary object was to prove, in the course of this account, 
all the results quoted from the literature in the recent paper of 
Bombieri¹ on the average distribution of primes in arithmetic
progressions; and to end by giving an exposition of this work, 
which seems likely to play an important part in future researches. 
The choice of what was included in the main body of the lectures, 
and what was omitted, has been greatly influenced by this considera- 
tion. A short section has, however, been added, giving some ref- 
erences to other work. 

In revising the lectures for publication I have aimed at producing 
a readable account of the subject, even at the cost of occasionally 
omitting some details. I hope that it will be found useful as an 
introduction to other books and monographs on analytic number 
theory. 

§§23 and 29 contain recent joint work of Professor Halberstam 
and myself, and I am indebted to Professor Halberstam for per- 
mission to include this. The former gives our version of the basic 
principle of the large sieve method, and the latter is an average 
result on primes in arithmetic progressions which may prove to be
a useful supplement to Bombieri’s theorem. No account is given of
other sieve methods, since these will form the theme of a later
volume in this series by Professors Halberstam and Richert.*


H.D.


¹ On the large sieve, Mathematika, 12, 201-225 (1965).

* This book subsequently appeared as Sieve Methods, Academic Press (London),
1974.
NOTATION


We write $f(x) = O(g(x))$, or equivalently $f(x) \ll g(x)$, when there
is a constant C such that $|f(x)| \le Cg(x)$ for all values of x under
consideration. We write $f(x) \sim g(x)$ when $\lim f(x)/g(x) = 1$ as
x tends to some limit, and $f(x) = o(g(x))$ when $\lim f(x)/g(x) = 0$.
Moreover, we say that $f(x) = \Omega(g(x))$ to indicate that
$\limsup |f(x)|/g(x) > 0$, while $f(x) = \Omega_{\pm}(g(x))$ means that
$\limsup f(x)/g(x) > 0$ and $\liminf f(x)/g(x) < 0$.

If $\mathbf{x}$ is a vector, then $\|\mathbf{x}\|$ denotes its norm, while, if $\theta$ is a real
number, then $\|\theta\|$ denotes the distance from $\theta$ to the nearest integer.
In certain contexts (see p. 32), we let [x] denote the largest integer
not exceeding the real number x, and we let (x) be the fractional part
of x, (x) = x - [x]. Generally s denotes a complex variable,
$s = \sigma + it$, while $\rho = \beta + i\gamma$ denotes the generic non-trivial zero
of the zeta function or of a Dirichlet L function. When no confusion
arises, we let $\gamma$ stand for Euler’s constant.

The arithmetic functions $d(n)$, $\Lambda(n)$, $\mu(n)$, and $\phi(n)$ are defined
as usual. Other symbols are defined on the following pages.


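Symbol                                              Page     Symbol                  Page

$\mathfrak{a}$                                      71       $S(T)$                  98
$B$                                                 80-82    $\mathfrak{S}(N)$       146
$B(\chi)$                                           83       $\Gamma(s)$             61, 73
$b(\chi)$                                           116      $\zeta(s)$              1
$c_q(n)$                                            148      $\zeta(s, \alpha)$      71
$E(x, q)$                                           161      $\xi(s)$                62
$E^*(x, q)$                                         161      $\xi(s, \chi)$          71
$e(\theta)$, $e_q(\theta)$                          7        $\pi(x)$                54
$h(d)$                                              44       $\sum^*$                160
$\operatorname{li} x$                               54       $\tau(\chi)$            65
$\mathfrak{M}$, $\mathfrak{M}(q, a)$, $\mathfrak{m}$  146      $\chi(n)$               29
$N(T)$                                              59       $\psi(x)$               60
$N(T, \chi)$                                        101      $\psi(x, \chi)$         115
$N(\alpha, T)$                                      134      $\psi'(x, \chi)$        162
$N(\alpha, T, \chi)$                                133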
1


PRIMES IN ARITHMETIC
PROGRESSION 


Analytic number theory may be said to begin with the work of 
Dirichlet, and in particular with Dirichlet’s memoir of 1837 on the 
existence of primes in a given arithmetic progression. 

Long before the time of Dirichlet it had been asserted that every 
arithmetic progression

$$a,\ a + q,\ a + 2q,\ a + 3q,\ \ldots,$$

in which a and q have no common factor, includes infinitely many
primes. Legendre, who had based some of his demonstrations on 
this proposition, attempted to give a proof but failed. The first proof 
was that of Dirichlet in the memoir I have referred to (Dirichlet’s
Werke, I, pp. 313-342), and strictly speaking this proof was complete
only in the case when q is a prime. For the general case, Dirichlet 
had to assume his class number formula, which he proved in a 
paper of 1839-1840 (Werke, I, pp. 411-496). Dirichlet states at the 
end of the earlier paper that originally he had a different proof, by 
indirect and complicated arguments, of the vital result that was 
needed [the fact that $L(1, \chi) \ne 0$ for each real nonprincipal character
$\chi$; see §4], but I do not think that there is any indication anywhere
of its nature. 

I shall follow Dirichlet’s example in treating first the simpler case 
in which q is a prime. We can suppose that q > 2, for when q = 2 
the arithmetic progression contains all sufficiently large odd 
numbers, and the proposition is then a triviality. 

Dirichlet’s starting point, as he himself says, was Euler’s proof of 
the existence of infinitely many primes. If we write 


$$\zeta(s) = \sum_{n=1}^{\infty} n^{-s}$$

for a real variable s > 1, then Euler’s identity is

$$\zeta(s) = \prod_{p} (1 - p^{-s})^{-1}$$

for s > 1, where p runs through all the primes; this identity is an
analytic equivalent for the proposition that every natural number 
can be factorized into prime powers in one and only one way. It 
follows from the identity that 


$$\log \zeta(s) = \sum_{p} \sum_{m=1}^{\infty} m^{-1} p^{-ms}.$$

Since $\zeta(s) \to \infty$ as $s \to 1$ from the right, and since

$$\sum_{p} \sum_{m=2}^{\infty} m^{-1} p^{-ms} < \sum_{p} \sum_{m=2}^{\infty} p^{-m} = \sum_{p} \frac{1}{p(p-1)} < 1,$$

it follows that

$$\sum_{p} p^{-s} \to \infty$$

as $s \to 1$ from the right. This proves the existence of an infinity of
primes, and proves further that the series $\sum p^{-1}$, extended over the
primes, diverges. Dirichlet’s aim was to prove the analogous statements
when the primes p are limited to those which satisfy the
condition $p \equiv a \pmod q$.
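Euler's argument can be illustrated numerically. The following is a minimal sketch, not part of the original text: the function names are chosen here for illustration, and the cutoffs are arbitrary. It compares a truncated Euler product with the known value $\zeta(2) = \pi^2/6$ and shows the slow growth of the partial sums of $\sum p^{-1}$.

```python
import math

def primes_up_to(limit):
    """Sieve of Eratosthenes: all primes p <= limit (limit >= 1)."""
    is_prime = [True] * (limit + 1)
    is_prime[0:2] = [False, False]
    for n in range(2, math.isqrt(limit) + 1):
        if is_prime[n]:
            for multiple in range(n * n, limit + 1, n):
                is_prime[multiple] = False
    return [n for n, flag in enumerate(is_prime) if flag]

def euler_product(s, limit):
    """Partial Euler product over primes p <= limit, for real s > 1."""
    product = 1.0
    for p in primes_up_to(limit):
        product /= 1.0 - p ** (-s)
    return product

def prime_reciprocal_sum(limit):
    """Partial sum of 1/p over the primes p <= limit."""
    return sum(1.0 / p for p in primes_up_to(limit))

if __name__ == "__main__":
    print(euler_product(2.0, 1000), math.pi ** 2 / 6)
    for limit in (10, 10**3, 10**5):
        print(limit, prime_reciprocal_sum(limit))
```

The partial sums of $\sum p^{-1}$ increase very slowly, which is consistent with divergence but of course proves nothing; the proof is the analytic argument above.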

To this end he introduced the arithmetic functions called 
Dirichlet’s characters. Each of these is a function of the integer
variable n, which is periodic with period q and is also multiplicative 
(without any restriction). Moreover, these functions are such that a 
suitable linear combination of them will produce the function which 
is 1 if $n \equiv a \pmod q$ and 0 otherwise.

The construction of these functions is based on the existence of a 
primitive root to the (prime) modulus q, or in other words on the 
cyclic structure of the residue classes modulo q under multiplication, 
when 0 is excluded. Let $\nu(n)$ denote the index of n relative to a fixed
primitive root g, that is, the exponent $\nu$ for which $g^{\nu} \equiv n \pmod q$. Let $\omega$ be
a real or complex number satisfying

$$\omega^{q-1} = 1.$$

Then the typical Dirichlet character for the modulus q is

$$\omega^{\nu(n)},$$

which is uniquely defined, since the value of $\nu(n)$ is indeterminate
only to the extent of the addition of a multiple of q - 1. The definition
presupposes that n is not divisible by q, but it is convenient to
complete the definition by taking the function to be 0 when n is
divisible by q. There is one function for each choice of $\omega$, and different
choices of $\omega$ give different functions; thus there are q - 1 such functions.
Each is a periodic function of n with period q, and is multiplicative
because, if

$$n \equiv n_1 n_2 \pmod q,$$

then

$$\nu(n) \equiv \nu(n_1) + \nu(n_2) \pmod{q - 1}.$$

(We have supposed here that neither $n_1$ nor $n_2$ is divisible by q, but
the multiplicative property is a triviality if either of them is.)

We recall the well-known fact that $\sum_{\omega} \omega^k$ has the value q - 1 if k
is divisible by q - 1 and has the value 0 otherwise. Hence

$$\sum_{\omega} \omega^{-\nu(a)} \omega^{\nu(n)} = \begin{cases} q - 1 & \text{if } n \equiv a \pmod q, \\ 0 & \text{otherwise,} \end{cases}$$

since $\nu(n) \equiv \nu(a) \pmod{q - 1}$ if and only if $n \equiv a \pmod q$. The
expression on the left, after division by q - 1, is the linear combination
of the various functions $\omega^{\nu(n)}$ that was referred to above; it
serves to select from all integers n those that are congruent to the
given number a to the modulus q.
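The construction just described can be sketched in a few lines of code. This is an illustrative sketch only: the helper names (`primitive_root`, `index_table`, `character`, `selector`) are introduced here and do not come from the text, and the primitive root is found by brute force, which is adequate only for small primes q.

```python
import cmath

def primitive_root(q):
    """Smallest primitive root modulo the odd prime q (brute-force search)."""
    for g in range(2, q):
        if len({pow(g, k, q) for k in range(q - 1)}) == q - 1:
            return g
    raise ValueError("no primitive root found; q should be an odd prime")

def index_table(q):
    """Map each n in 1..q-1 to its index nu(n), where g**nu(n) = n (mod q)."""
    g = primitive_root(q)
    return {pow(g, k, q): k for k in range(q - 1)}

def character(q, j, nu):
    """The character n -> omega**nu(n), with omega = exp(2*pi*i*j/(q-1))."""
    omega = cmath.exp(2j * cmath.pi * j / (q - 1))
    return lambda n: 0 if n % q == 0 else omega ** nu[n % q]

def selector(q, a, n, nu):
    """(1/(q-1)) * sum over omega of omega**(-nu(a)) * omega**nu(n)."""
    if n % q == 0:
        return 0
    total = 0
    for j in range(q - 1):
        omega = cmath.exp(2j * cmath.pi * j / (q - 1))
        total += omega ** (nu[n % q] - nu[a % q])
    return total / (q - 1)

if __name__ == "__main__":
    nu = index_table(7)
    print([round(abs(selector(7, 3, n, nu))) for n in range(1, 15)])
```

For q = 7 and a = 3 the printed list has a 1 exactly at the positions n = 3 and n = 10, up to floating-point rounding, which is the selecting property described above.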
For each of the possible choices for $\omega$, Dirichlet introduced the
function

$$L_\omega(s) = \sum_{\substack{n=1 \\ n \not\equiv 0 \ (\mathrm{mod}\ q)}}^{\infty} \omega^{\nu(n)} n^{-s}$$

of the real variable s, for s > 1. Since the coefficient of $n^{-s}$ is a
multiplicative function of n, we have the analog of Euler’s identity:

$$L_\omega(s) = \prod_{p \ne q} \left(1 - \omega^{\nu(p)} p^{-s}\right)^{-1},$$

for s > 1. A detailed proof is easily given, on the same lines as for
Euler’s original identity, by considering first the finite product over
$p \le N$ and then making $N \to \infty$.

None of the factors on the right vanishes, since $|\omega^{\nu(p)} p^{-s}| =
p^{-s} < \frac12$ for s > 1, and as the product is absolutely convergent it
follows that $L_\omega(s) \ne 0$ for s > 1. Taking the logarithm of both
sides, we get

$$\log L_\omega(s) = \sum_{p \ne q} \sum_{m=1}^{\infty} m^{-1} \omega^{\nu(p^m)} p^{-ms}.$$

The logarithm on the left is, in principle, multivalued if $\omega$ is complex,
but the value which is provided by the series on the right is obviously
the natural one to use, since it is a continuous function of s for s > 1
and tends to 0 as $s \to \infty$, corresponding to the fact that $L_\omega(s) \to 1$
(1 being the first term in its defining series).

Multiplying the last equation by $\omega^{-\nu(a)}$, summing over all the
values of $\omega$, and dividing by q - 1, we obtain

$$\frac{1}{q-1} \sum_{\omega} \omega^{-\nu(a)} \log L_\omega(s) = \sum_{p} \sum_{\substack{m=1 \\ p^m \equiv a \ (\mathrm{mod}\ q)}}^{\infty} m^{-1} p^{-ms}. \tag{1}$$

The sum of all those terms on the right for which m > 1 is at most 1,
since they are a subset of the terms considered earlier in connection
with $\log \zeta(s)$. Hence the right side of (1) is

$$\sum_{p \equiv a \ (\mathrm{mod}\ q)} p^{-s} + O(1).$$

The essential idea of Dirichlet’s memoir is to prove that the left side
of (1) tends to $+\infty$ as $s \to 1$. This will imply that there are infinitely
many primes $p \equiv a \pmod q$, and further that the series $\sum p^{-1}$
extended over these primes is divergent.

One of the terms in the sum on the left of (1) comes from $\omega = 1$,
and is simply $\log L_1(s)$. The function $L_1(s)$ is related in a simple way
to $\zeta(s)$, for we have

$$L_1(s) = \sum_{\substack{n=1 \\ n \not\equiv 0 \ (\mathrm{mod}\ q)}}^{\infty} n^{-s} = (1 - q^{-s})\zeta(s).$$

Hence $L_1(s) \to +\infty$ as $s \to 1$ from the right, and therefore the same
is true of $\log L_1(s)$. Hence to complete the proof it will suffice to show
that, for each choice of $\omega$ other than 1,

$$\log L_\omega(s)$$

is bounded as $s \to 1$ from the right.
At this point it clarifies the situation if we observe that, provided
$\omega \ne 1$, the series which defines $L_\omega(s)$, namely

$$L_\omega(s) = \sum_{\substack{n=1 \\ n \not\equiv 0 \ (\mathrm{mod}\ q)}}^{\infty} \omega^{\nu(n)} n^{-s},$$

is convergent not only for s > 1 but for s > 0. It is, in fact, a series of
the type covered by Dirichlet’s test for convergence, since (a) $n^{-s}$
decreases as n increases and has the limit 0, and (b) the sum of any
number of the coefficients $\omega^{\nu(n)}$ is bounded. The justification for
(b) lies in the fact that $\omega^{\nu(n)}$ is periodic with period q, and

$$\sum_{n=1}^{q-1} \omega^{\nu(n)} = \sum_{m=0}^{q-2} \omega^{m} = 0,$$

since the index $\nu(n)$ runs through a complete set of residues to the
modulus q - 1.

It follows further from Dirichlet’s test that the series is uniformly
convergent with respect to s for $s \ge \delta > 0$, and consequently
$L_\omega(s)$ is a continuous function of s for s > 0. So to prove that $\log L_\omega(s)$
is bounded as $s \to 1$ from the right is equivalent to proving that

$$L_\omega(1) \ne 0. \tag{2}$$

Dirichlet’s proof of this takes entirely different forms according as
$\omega$ is real or complex. The only real value of $\omega$ is -1, since $\omega \ne 1$
now.
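Statement (2) can be checked numerically for a small modulus, though this is only an illustration and not a proof. The sketch below is an assumption-laden example: the function names are invented here, the series for $L_\omega(1)$ is truncated after a whole number of periods (so that the bounded partial sums of the coefficients keep the tail small), and q = 7 is an arbitrary choice.

```python
import cmath
import math

def index_table(q):
    """Indices nu(n) relative to the smallest primitive root of the odd prime q."""
    for g in range(2, q):
        powers = [pow(g, k, q) for k in range(q - 1)]
        if len(set(powers)) == q - 1:
            return {n: k for k, n in enumerate(powers)}
    raise ValueError("expected an odd prime q")

def L_at_one(q, j, periods=20000):
    """Partial sum of L_omega(1) over `periods` full periods,
    where omega = exp(2*pi*i*j/(q-1)) and j is not a multiple of q-1."""
    nu = index_table(q)
    omega = cmath.exp(2j * cmath.pi * j / (q - 1))
    coeff = [0] + [omega ** nu[r] for r in range(1, q)]
    total = 0j
    for n in range(1, periods * q + 1):
        total += coeff[n % q] / n
    return total

if __name__ == "__main__":
    for j in range(1, 6):
        print(j, abs(L_at_one(7, j, periods=2000)))
```

For q = 7 the choice j = 3 gives $\omega = -1$, and the partial sum is close to $\pi/\sqrt{7}$, in agreement with the finite formula for L(1) obtained later in this section.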

Suppose first that $\omega$ is complex. If we take a = 1, and so $\nu(a) = 0$,
in (1), we get

$$\frac{1}{q-1} \sum_{\omega} \log L_\omega(s) = \sum_{p} \sum_{\substack{m=1 \\ p^m \equiv 1 \ (\mathrm{mod}\ q)}}^{\infty} m^{-1} p^{-ms}.$$

Since the terms on the right (if there are any) are positive, it follows
that

$$\sum_{\omega} \log L_\omega(s) \ge 0,$$

which implies that

$$\prod_{\omega} L_\omega(s) \ge 1. \tag{3}$$

All this, of course, is for s > 1.

If there is some complex $\omega$ for which $L_\omega(1) = 0$, then $L_{\bar\omega}(1) = 0$
also, where $\bar\omega$ denotes the complex conjugate of $\omega$. Thus two of the
factors on the left of (3) will have the limit 0 as $s \to 1$ from the right.
One other factor, namely $L_1(s)$, has the limit $+\infty$. Any other factors
are certainly bounded, being continuous functions of s for s > 0. On
examining in more detail the behavior as $s \to 1$ of the three factors
mentioned, we shall get a contradiction to (3), in that the two factors
with limit 0 will more than cancel the one factor with limit $+\infty$.

As regards $L_1(s)$, we have

$$L_1(s) = (1 - q^{-s})\zeta(s) < (1 - q^{-2})\zeta(s)$$

for 1 < s < 2, and

$$\zeta(s) = \sum_{n=1}^{\infty} n^{-s} < 1 + \int_1^{\infty} x^{-s}\,dx = \frac{s}{s-1}.$$

Hence

$$L_1(s) < \frac{A}{s-1}$$

for 1 < s < 2, where A is independent of s.
As regards $L_\omega(s)$, the supposition that $L_\omega(1) = 0$ implies that, for
s > 1,

$$L_\omega(s) = L_\omega(s) - L_\omega(1) = (s - 1) L'_\omega(s_1),$$

where $s_1$ is some number between 1 and s. The series for $L'_\omega(s)$,
namely

$$L'_\omega(s) = -\sum_{\substack{n=1 \\ n \not\equiv 0 \ (\mathrm{mod}\ q)}}^{\infty} \omega^{\nu(n)} (\log n) n^{-s},$$

is convergent for s > 0 by Dirichlet’s test, since the function
$(\log n) n^{-s}$ decreases for sufficiently large n and has the limit 0.
Moreover the convergence is again uniform for $s \ge \delta > 0$, so that
$L'_\omega(s)$ is continuous for s > 0. In particular, $|L'_\omega(s)|$ is bounded for
s > 1, and therefore

$$|L_\omega(s)| < A_1 (s - 1)$$

for s > 1, where $A_1$ is independent of s. Naturally the same
inequality holds with $\bar\omega$ in place of $\omega$.

On using these inequalities in (3), and making $s \to 1$, we get the
desired contradiction.

The argument could have been expressed more briefly by using
the elements of complex function theory. As we shall see later,
$L_1(s)$ has a simple pole at s = 1, and the supposition that $L_\omega(s)$ and
$L_{\bar\omega}(s)$ have zeros at s = 1 implies that the product on the left of (3)
has a zero at s = 1, which contradicts the inequality.

Suppose now that $\omega = -1$. The above argument is inapplicable,
since the supposition that $L_{-1}(1) = 0$ would produce only a single
factor with a zero at s = 1.

We now have

$$\omega^{\nu(n)} = (-1)^{\nu(n)} = \left(\frac{n}{q}\right),$$

and the convention made earlier that $\omega^{\nu(n)}$ is to be replaced by 0 when
$n \equiv 0 \pmod q$ is in agreement with the usual convention for the
Legendre symbol on the right. From now on, we abbreviate $L_{-1}(s)$
to L(s), since this will be the only function with which we shall be
concerned. Thus we have

$$L(s) = \sum_{n=1}^{\infty} \left(\frac{n}{q}\right) n^{-s}.$$


The aim is to prove that $L(1) \ne 0$. We already know that $L(1) \ge 0$,
from the continuity of L(s) and the fact that L(s) > 0 for s > 1 (by the
Euler product formula). It may be worth remarking that the need to
prove that $L(1) \ne 0$ is almost inevitable in the approach we are
using. If it were possible for L(1) to vanish, it would follow, on
considering $\log L(s)$, that

$$\sum_{p} \left(\frac{p}{q}\right) p^{-s} \to -\infty \quad \text{as } s \to 1.$$

This would imply a great preponderance of primes in those residue
classes a (mod q) for which a is a quadratic nonresidue, and this
preponderance might (on the face of things) be such that $\sum p^{-1}$
summed over the primes in the other residue classes was convergent.

Dirichlet’s proof, in the case now under consideration, is based on 
a relationship (which goes back to the work of Gauss) between the 
quadratic character (n|q) and the complex exponential function
$e^{2\pi i n/q}$, which I shall abbreviate to e(n/q) or to $e_q(n)$. (Instead of
speaking about the complex exponential function, we could speak
about the qth roots of unity; but it is necessary for some purposes to
be able to distinguish between one qth root of unity and another,
and the complex exponential function offers the simplest way of 
doing this.) 

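Let G(n) denote the so-called Gaussian sum, defined by

$$G(n) = \sum_{m=1}^{q-1} \left(\frac{m}{q}\right) e_q(mn). \tag{4}$$

By changing the variable of summation from m to m′, where
$m' \equiv mn \pmod q$, we obtain the relation

$$G(n) = \left(\frac{n}{q}\right) G(1) = \left(\frac{n}{q}\right) G, \tag{5}$$

say. The argument presupposes that $n \not\equiv 0 \pmod q$, but the relation
holds in the excluded case also, because then
$G(n) = \sum_{m=1}^{q-1} \left(\frac{m}{q}\right) = 0$.

Assuming that $G \ne 0$ (an assumption that will be justified later),
we have

$$\left(\frac{n}{q}\right) = \frac{1}{G} \sum_{m=1}^{q-1} \left(\frac{m}{q}\right) e_q(mn).$$

Substituting this in the series for L(1), we obtain

$$L(1) = \frac{1}{G} \sum_{m=1}^{q-1} \left(\frac{m}{q}\right) \sum_{n=1}^{\infty} \frac{1}{n} e_q(mn).$$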
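Relation (5) is easy to test numerically. The sketch below is illustrative only: the names `legendre`, `e_q` and `gauss_sum` are chosen here, the Legendre symbol is computed by Euler's criterion, and the modulus 11 in the usage lines is an arbitrary small prime.

```python
import cmath

def legendre(n, q):
    """Legendre symbol (n|q) for an odd prime q, via Euler's criterion."""
    r = pow(n % q, (q - 1) // 2, q)  # r is 0, 1 or q - 1
    return -1 if r == q - 1 else r

def e_q(x, q):
    """The complex exponential e_q(x) = exp(2*pi*i*x/q)."""
    return cmath.exp(2j * cmath.pi * x / q)

def gauss_sum(n, q):
    """G(n) = sum over m = 1..q-1 of (m|q) * e_q(m*n), as in (4)."""
    return sum(legendre(m, q) * e_q(m * n, q) for m in range(1, q))

if __name__ == "__main__":
    q = 11
    G = gauss_sum(1, q)
    for n in range(q + 1):
        # The two printed columns should agree up to rounding error.
        print(n, gauss_sum(n, q), legendre(n, q) * G)
```

The case n = 0 and the case n = q both give G(n) = 0 up to rounding, which is the excluded case mentioned in the text.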


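The sum of the inner series is easily deduced from that of the
logarithmic series. If $|z| \le 1$ and $z \ne 1$, we have

$$-\log(1 - z) = \sum_{n=1}^{\infty} \frac{1}{n} z^n,$$

where the logarithm has its principal value. That means, in the
present context, that $\arg(1 - z)$ lies between $-\frac12\pi$ and $\frac12\pi$, since the
real part of 1 - z is positive. Taking $z = e^{i\theta}$, where $0 < \theta < 2\pi$, we
have $\arg(1 - z) = \frac12(\theta - \pi)$, as is easily seen either from a picture or
by calculation. Also $|1 - z| = 2 \sin \frac12\theta$. Hence

$$\sum_{n=1}^{\infty} \frac{1}{n} e^{in\theta} = -\log(2 \sin \tfrac12\theta) - \tfrac12(\theta - \pi)i.$$

Putting $\theta = 2\pi m/q$ and substituting in the formula for L(1), we get

$$L(1) = -\frac{1}{G} \sum_{m=1}^{q-1} \left(\frac{m}{q}\right) \left[\log \sin \frac{\pi m}{q} + \frac{i\pi m}{q}\right], \tag{6}$$

the constant terms contributing nothing, since $\sum_{m=1}^{q-1} \left(\frac{m}{q}\right) = 0$.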

As we shall prove later, the value of G is $q^{1/2}$ if $q \equiv 1 \pmod 4$ and
$iq^{1/2}$ if $q \equiv 3 \pmod 4$. This compels one to distinguish two cases.
Suppose $q \equiv 3 \pmod 4$. Since L(1) is real, we must have

$$L(1) = -\frac{\pi}{q^{3/2}} \sum_{m=1}^{q-1} m \left(\frac{m}{q}\right), \tag{7}$$

and in fact the vanishing of the other part of the sum is easily verified
by taking together the terms m and q - m. The last formula gives an
elegant expression for L(1) by a finite sum, the value of which is
easily computed in any particular case. For example, if q = 23, we
have

$$\begin{aligned}
\sum_{m=1}^{q-1} m \left(\frac{m}{q}\right) &= 1 + 2 + 3 + 4 - 5 + 6 - 7 + 8 + 9 - 10 - 11 + 12 \\
&\quad + 13 - 14 - 15 + 16 - 17 + 18 - 19 - 20 - 21 - 22 \\
&= -69,
\end{aligned}$$

and

$$L(1) = 3\pi/\sqrt{23}.$$
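The worked example for q = 23 can be reproduced directly. The following sketch is illustrative and uses names chosen here (`weighted_sum`, `L1_formula7`, `L1_series`); it evaluates the finite sum in (7) and compares the result with a truncated version of the defining series, summed over whole periods.

```python
import math

def legendre(n, q):
    """Legendre symbol (n|q) for an odd prime q, via Euler's criterion."""
    r = pow(n % q, (q - 1) // 2, q)
    return -1 if r == q - 1 else r

def weighted_sum(q):
    """The finite sum of m * (m|q) over m = 1..q-1 appearing in (7)."""
    return sum(m * legendre(m, q) for m in range(1, q))

def L1_formula7(q):
    """L(1) from formula (7); intended for primes q = 3 (mod 4)."""
    return -math.pi / q ** 1.5 * weighted_sum(q)

def L1_series(q, periods=2000):
    """Truncation of the series for L(1) after `periods` full periods."""
    symbol = [legendre(r, q) for r in range(q)]
    return sum(symbol[n % q] / n for n in range(1, periods * q + 1))

if __name__ == "__main__":
    print(weighted_sum(23))
    print(L1_formula7(23), 3 * math.pi / math.sqrt(23), L1_series(23))
```

The series converges only conditionally, so the truncated value agrees with the closed form to a few decimal places rather than to machine precision.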


The finite sum occurring above is always an odd integer, for it has
the same parity as

$$\sum_{m=1}^{q-1} m = \tfrac12(q - 1)q,$$

and both $\frac12(q - 1)$ and q are odd. It therefore cannot vanish, and this
gives the proof of the desired result that $L(1) \ne 0$ in the case now
under consideration. It is a remarkable fact that no one has yet
given a simple and direct proof that the value of the finite sum in (7)
is negative, though we know that this must be so from the fact that
L(s) > 0 for s > 1 and consequently $L(1) \ge 0$.

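Dirichlet gave another expression for L(1) as a finite sum, in
addition to (7), which is of great interest and which is more convenient
for computation. By Euler’s product formula [or alternatively from
the original definition of L(s)], we have

$$\left[1 - \left(\frac{2}{q}\right) 2^{-s}\right] L(s) = \sum_{\substack{n=1 \\ n \text{ odd}}}^{\infty} \left(\frac{n}{q}\right) n^{-s}. \tag{8}$$

Proceeding as before, with s = 1, and using the fact that

$$\Im \sum_{\substack{n=1 \\ n \text{ odd}}}^{\infty} \frac{1}{n} e_q(mn) = \begin{cases} \frac14\pi & \text{if } 0 < m < \frac12 q, \\ -\frac14\pi & \text{if } \frac12 q < m < q \end{cases}$$

(which is easily deduced from the sum of the logarithmic series), we
obtain

$$L(1) = \left[1 - \frac12\left(\frac{2}{q}\right)\right]^{-1} \frac{\pi}{2q^{1/2}} \sum_{0 < m < \frac12 q} \left(\frac{m}{q}\right) = \frac{\pi}{[2 - (2|q)]\, q^{1/2}} \sum_{0 < m < \frac12 q} \left(\frac{m}{q}\right).$$

This shows that, for $q \equiv 3 \pmod 4$, there are always more quadratic
residues than nonresidues in the first half of the range from 0 to q.
Again, no direct proof is known.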
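The half-range formula and its consequence about residues can be spot-checked for small primes. This is a sketch under stated assumptions: the function names are invented here, the primes tested are an arbitrary selection with $q \equiv 3 \pmod 4$, and agreement for these cases is of course no substitute for the proof.

```python
import math

def legendre(n, q):
    """Legendre symbol (n|q) for an odd prime q, via Euler's criterion."""
    r = pow(n % q, (q - 1) // 2, q)
    return -1 if r == q - 1 else r

def half_range_sum(q):
    """Sum of (m|q) over 0 < m < q/2: residues minus nonresidues there."""
    return sum(legendre(m, q) for m in range(1, (q + 1) // 2))

def L1_half_range(q):
    """L(1) from the half-range formula, for primes q = 3 (mod 4)."""
    return math.pi * half_range_sum(q) / ((2 - legendre(2, q)) * math.sqrt(q))

def L1_formula7(q):
    """L(1) from formula (7), for comparison."""
    return -math.pi / q ** 1.5 * sum(m * legendre(m, q) for m in range(1, q))

if __name__ == "__main__":
    for q in (3, 7, 11, 19, 23, 31, 43, 47):
        print(q, half_range_sum(q), L1_half_range(q), L1_formula7(q))
```

In each row the half-range sum is positive and the two expressions for L(1) agree to rounding error.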

Suppose $q \equiv 1 \pmod 4$. Then $G = q^{1/2}$, and (6) gives

$$L(1) = -\frac{1}{q^{1/2}} \sum_{m=1}^{q-1} \left(\frac{m}{q}\right) \log \sin \frac{\pi m}{q}. \tag{9}$$

This can be written as

$$L(1) = \frac{1}{q^{1/2}} \log Q,$$

where

$$Q = \frac{\prod_N \sin(\pi N/q)}{\prod_R \sin(\pi R/q)},$$

in which we use R to denote the typical quadratic residue (mod q)
between 0 and q, and N to denote the typical quadratic nonresidue.

To prove that $L(1) \ne 0$ is equivalent to proving that $Q \ne 1$. For
this, Dirichlet had recourse to a result that had been proved by
Gauss in his work on cyclotomy. This is that, for an indeterminate x,

$$\prod_{R} [x - e_q(R)] = \tfrac12 [Y(x) - q^{1/2} Z(x)]$$

and

$$\prod_{N} [x - e_q(N)] = \tfrac12 [Y(x) + q^{1/2} Z(x)],$$

where Y(x) and Z(x) are polynomials with integral coefficients.
Assuming this, we have the identity

$$\tfrac14 [Y^2(x) - qZ^2(x)] = \prod_{m=1}^{q-1} [x - e_q(m)] = x^{q-1} + x^{q-2} + \cdots + 1.$$

Putting x = 1, we obtain integers Y = Y(1) and Z = Z(1) which
satisfy the Diophantine equation

$$Y^2 - qZ^2 = 4q.$$

[Obviously Y must be divisible by q, and the present argument
provides a method for solving the “negative” Pellian equation
$qY_1^2 - Z^2 = 4$, when q is a prime congruent to 1 (mod 4).] We note
that $Z \ne 0$, since 4q is not a perfect square.

The quotient Q which occurred above is expressible in terms of
Y and Z, as follows. We have

$$\prod_R [1 - e_q(R)] = \prod_R e_q(\tfrac12 R)\left(-2i \sin \frac{\pi R}{q}\right) = (-2i)^{\frac12(q-1)}\, e_q\Big(\tfrac12 \sum R\Big) \prod_R \sin \frac{\pi R}{q},$$

and a similar relation with N in place of R. Now

$$\sum R = \sum N = \tfrac14 q(q - 1),$$

on grouping together the numbers R and R′ = q - R, and similarly
with N. Hence

$$Q = \frac{\prod_N \sin(\pi N/q)}{\prod_R \sin(\pi R/q)} = \frac{Y + q^{1/2} Z}{Y - q^{1/2} Z}.$$

Since $Z \ne 0$, we have the desired conclusion that $Q \ne 1$.
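The integers Y = Y(1) and Z = Z(1) can be recovered numerically from the two products at x = 1, which gives a quick check of the Diophantine equation $Y^2 - qZ^2 = 4q$. The sketch below is illustrative: the names `residue_products` and `Y_and_Z` are introduced here, floating-point products are rounded to the nearest integer (safe only for small q), and it presupposes the sign $G = +q^{1/2}$ that is established in §2.

```python
import cmath
import math

def legendre(n, q):
    """Legendre symbol (n|q) for an odd prime q, via Euler's criterion."""
    r = pow(n % q, (q - 1) // 2, q)
    return -1 if r == q - 1 else r

def residue_products(q):
    """Products of 1 - e_q(m) over the residues R and the nonresidues N."""
    p_r, p_n = 1, 1
    for m in range(1, q):
        factor = 1 - cmath.exp(2j * cmath.pi * m / q)
        if legendre(m, q) == 1:
            p_r *= factor
        else:
            p_n *= factor
    return p_r, p_n

def Y_and_Z(q):
    """Solve P_R = (Y - sqrt(q) Z)/2, P_N = (Y + sqrt(q) Z)/2 for Y and Z.
    Intended for primes q = 1 (mod 4), where both products are real."""
    p_r, p_n = residue_products(q)
    Y = (p_r + p_n).real
    Z = ((p_n - p_r) / math.sqrt(q)).real
    return round(Y), round(Z)

if __name__ == "__main__":
    for q in (5, 13, 17, 29):
        Y, Z = Y_and_Z(q)
        print(q, Y, Z, Y * Y - q * Z * Z)
```

For q = 5 this gives Y = 5 and Z = 1, so that $Q = (5 + \sqrt5)/(5 - \sqrt5)$, which is visibly different from 1.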

This completes the proof of Dirichlet’s theorem for a prime 
modulus q, subject to the proof of the value of Gauss’ sum and the
proof of the result on cyclotomy which we have just used. These 
proofs will be given in the next two sections. 

I ought perhaps to add that Dirichlet derived the finite expression 
(6) for L(1) by a somewhat different method from that which we have 
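used above. He started from the power series

$$\sum_{n=1}^{\infty} \left(\frac{n}{q}\right) x^n = \frac{1}{1 - x^q} \sum_{m=1}^{q-1} \left(\frac{m}{q}\right) x^m = \frac{x f(x)}{1 - x^q},$$

say, and by putting this in the formula

$$\Gamma(s)\, n^{-s} = \int_0^1 x^{n-1} (\log x^{-1})^{s-1}\,dx$$

he obtained

$$\Gamma(s) L(s) = \int_0^1 \frac{f(x)}{1 - x^q} (\log x^{-1})^{s-1}\,dx.$$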


On putting s = 1 and expressing the rational function in the inte- 
grand as a sum of partial fractions, and integrating, he obtained (6). 
The two methods are essentially equivalent, but the last formula 
written above has some independent interest in that it serves to 
define L(s) as a regular function of the complex variable s for all s. 


2 


GAUSS’ SUM 


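In this section we evaluate the sum

$$G = \sum_{m=1}^{q-1} \left(\frac{m}{q}\right) e_q(m),$$

where q is a prime other than 2. It is easy to prove that $G^2 = q$
if $q \equiv 1 \pmod 4$ and $G^2 = -q$ if $q \equiv 3 \pmod 4$, though this does
not determine G completely. The computation is as follows. We
have

$$G^2 = \sum_{m_1=1}^{q-1} \sum_{m_2=1}^{q-1} \left(\frac{m_1 m_2}{q}\right) e_q(m_1 + m_2).$$

On changing the variable of summation in the inner sum from $m_2$
to n, where $m_2 \equiv m_1 n \pmod q$, we get

$$G^2 = \sum_{m_1=1}^{q-1} \sum_{n=1}^{q-1} \left(\frac{n}{q}\right) e_q(m_1 + m_1 n).$$

Now we interchange the order of summation, and note that

$$\sum_{m_1=1}^{q-1} e_q(m_1(n + 1)) = \begin{cases} q - 1 & \text{if } n \equiv -1 \pmod q, \\ -1 & \text{otherwise.} \end{cases}$$

Hence

$$G^2 = \left(\frac{-1}{q}\right)(q - 1) - \sum_{n=1}^{q-2} \left(\frac{n}{q}\right) = \left(\frac{-1}{q}\right) q = \begin{cases} q & \text{if } q \equiv 1 \pmod 4, \\ -q & \text{if } q \equiv 3 \pmod 4, \end{cases}$$

as stated above.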
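The value of $G^2$ can be confirmed numerically for small primes. This is an illustrative sketch with names chosen here; it checks only the square, since the determination of the sign of G is the harder question taken up below.

```python
import cmath

def legendre(n, q):
    """Legendre symbol (n|q) for an odd prime q, via Euler's criterion."""
    r = pow(n % q, (q - 1) // 2, q)
    return -1 if r == q - 1 else r

def gauss_sum(q):
    """G = sum over m = 1..q-1 of (m|q) * exp(2*pi*i*m/q)."""
    return sum(legendre(m, q) * cmath.exp(2j * cmath.pi * m / q)
               for m in range(1, q))

def gauss_sum_squared_expected(q):
    """The predicted value (-1|q) * q of G**2."""
    return legendre(-1, q) * q

if __name__ == "__main__":
    for q in (3, 5, 7, 11, 13, 17, 19, 23):
        print(q, gauss_sum(q) ** 2, gauss_sum_squared_expected(q))
```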
The sign of G was determined by Gauss only after many and
varied unsuccessful attempts.¹ Since then several proofs have been
given, based on a variety of different methods.² As Gauss himself
remarked, any proof of the exact value of G must take account of
the particular ordering of the qth roots of unity, which is provided
by the complex exponential function. If instead of G we consider
the sum

$$\sum_{m=1}^{q-1} \left(\frac{m}{q}\right) \zeta^m,$$

where $\zeta$ is any qth root of unity (other than 1), its sign cannot be
specified, for it follows from (5) of the preceding section that if $\zeta$
is replaced by $\zeta^n$, where n is a quadratic nonresidue (mod q), the
sign of the sum gets reversed. The evaluation of $G^2$ given above
would, however, apply equally well to the sum with an unspecified $\zeta$.

¹ See Gauss, Werke II, p. 156.


The method used by Dirichlet in 1835 (Werke I, pp. 237-256) to 
evaluate G is probably the most satisfactory of all that are known. 
It is based on Poisson’s summation formula, and it has the advan- 
tage that once the proof has been embarked upon, no special in- 
genuity is called for. 

It is first necessary to express the definition of G in a form which 
does not contain explicitly the symbol of quadratic character. 
With the same meaning for R and N as in §1, we have 


$$G = \sum_R e_q(R) - \sum_N e_q(N) = 1 + 2 \sum_R e_q(R).$$

This can be written equivalently as

$$G = \sum_{x=0}^{q-1} e_q(x^2),$$

for $x^2$ assumes the value 0 once and assumes each value R twice.
Dirichlet’s method actually evaluates the more general sum

$$S = \sum_{n=0}^{N-1} e^{2\pi i n^2/N},$$

where N is any positive integer, and the answer is that

$$S = \begin{cases} (1 + i)N^{1/2} & \text{if } N \equiv 0 \pmod 4, \\ N^{1/2} & \text{if } N \equiv 1 \pmod 4, \\ 0 & \text{if } N \equiv 2 \pmod 4, \\ iN^{1/2} & \text{if } N \equiv 3 \pmod 4. \end{cases}$$


² See Landau, Vorlesungen I, pp. 157-171, and Estermann, J. London Math. Soc.,
20, 66-67 (1945).


Here $N^{1/2}$ denotes the positive square root. It may be as well to add
also that i denotes the same square root of $-1$ as occurs in $e^{2\pi i n^2/N}$.
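The four stated values of S are easy to compare with a direct evaluation of the sum. The following is a minimal sketch, with function names chosen here for illustration; it reduces $n^2$ modulo N before exponentiating, which is only a numerical convenience.

```python
import cmath
import math

def quadratic_gauss_sum(N):
    """S = sum over n = 0..N-1 of exp(2*pi*i*n*n/N), computed directly."""
    return sum(cmath.exp(2j * cmath.pi * ((n * n) % N) / N) for n in range(N))

def predicted_value(N):
    """The value of S according to the residue class of N modulo 4."""
    root = math.sqrt(N)
    return {0: (1 + 1j) * root, 1: root, 2: 0, 3: 1j * root}[N % 4]

if __name__ == "__main__":
    for N in range(1, 13):
        print(N, quadratic_gauss_sum(N), predicted_value(N))
```

Note that N here is any positive integer, not only a prime, in line with the generality of Dirichlet's method.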

Poisson’s summation formula states that, under certain condi- 
tions on the function f(x), 


$${\sum_{n=A}^{B}}' f(n) = \sum_{\nu=-\infty}^{\infty} \int_A^B f(x) e^{2\pi i \nu x}\,dx,$$

where $\sum'$ means that the end terms of the sum, corresponding to
n = A and n = B, are to be replaced by $\frac12 f(A)$ and $\frac12 f(B)$. In the
series on the right it may be necessary to take the terms $\nu$ and $-\nu$
together to ensure convergence, but actually it will not be necessary
in the present application. Sufficient conditions for the validity of 
the formula are that f(x) is a real function which is continuous 
and monotonic in stretches. From the point of view of analysis 
these are severe restrictions, but they are quite adequate for most 
applications. 

Poisson’s summation formula, which is an extremely useful 
tool in analytic number theory, is easily deduced (under the above 
restrictions) from the basic theorem concerning Fourier series, 
which was first rigorously proved by Dirichlet himself in 1829 
(Werke I, pp. 117-132). Let $f_1(x)$ coincide with f(x) for $0 \le x < 1$
and be defined elsewhere by periodicity with period 1; then $f_1(x)$
is continuous for 0 < x < 1 but has (in general) an ordinary
discontinuity at x = 0, its values on the left and on the right being
f(1) and f(0) respectively. The Fourier series of $f_1(x)$ is

$$\tfrac12 a_0 + \sum_{\nu=1}^{\infty} (a_\nu \cos 2\pi\nu x + b_\nu \sin 2\pi\nu x),$$

where the coefficients are given by Fourier’s formulas:

$$\tfrac12 a_\nu = \int_0^1 f(x) \cos 2\pi\nu x\,dx, \qquad \tfrac12 b_\nu = \int_0^1 f(x) \sin 2\pi\nu x\,dx.$$

The theorem in question is that the above series converges for all x,
and that its sum is $f_1(x)$ at a point of continuity of $f_1(x)$, and is the
mean of the left and right values of $f_1(x)$ at a point of ordinary
discontinuity. Thus, taking x = 0, we get

$$\tfrac12 [f(0) + f(1)] = \tfrac12 a_0 + \sum_{\nu=1}^{\infty} a_\nu = \sum_{\nu=-\infty}^{\infty} \int_0^1 f(x) \cos 2\pi\nu x\,dx.$$

This is the case A = 0, B = 1 of Poisson’s summation formula,
and the general case follows on replacing f(x) by f(x + n), for
n = A, A + 1, ..., B - 1, and adding the results.
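For the application to Gauss’ sum, we take $f(x) = \cos 2\pi x^2/N$
and $f(x) = \sin 2\pi x^2/N$ and combine the results. Thus

$$\begin{aligned}
S = {\sum_{n=0}^{N}}' e^{2\pi i n^2/N}
  &= \sum_{\nu=-\infty}^{\infty} \int_0^N e^{2\pi i \nu x + 2\pi i x^2/N}\,dx \\
  &= N \sum_{\nu=-\infty}^{\infty} \int_0^1 e^{2\pi i N(x^2 + \nu x)}\,dx \\
  &= N \sum_{\nu=-\infty}^{\infty} e^{-\frac12 \pi i N \nu^2} \int_{\frac12\nu}^{1 + \frac12\nu} e^{2\pi i N y^2}\,dy,
\end{aligned}$$

where in the last step we have put $x + \frac12\nu = y$. The value of $e^{-\frac12 \pi i N \nu^2}$
is 1 if $\nu$ is even, and is $e^{-\frac12 \pi i N} = i^{-N}$ if $\nu$ is odd. We therefore divide
the sum over $\nu$ into two parts, according as $\nu$ is even or odd, and
we put $\nu = 2\mu$ or $2\mu - 1$ as the case may be. This gives

$$S = N \sum_{\mu=-\infty}^{\infty} \int_{\mu}^{\mu+1} e^{2\pi i N y^2}\,dy + N i^{-N} \sum_{\mu=-\infty}^{\infty} \int_{\mu - \frac12}^{\mu + \frac12} e^{2\pi i N y^2}\,dy.$$

Each series of integrals fits together to give

$$\int_{-\infty}^{\infty} e^{2\pi i N y^2}\,dy.$$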


This is a convergent integral, and it is a matter of indifference
whether we construe it in the narrow sense, as

$$\lim_{Y \to \infty} \int_{-Y}^{Y},$$

or in the wider sense as

$$\lim_{Y, Z \to \infty} \int_{-Y}^{Z}.$$

For if $Y' > Y > 0$ we have

$$\int_Y^{Y'} e^{2\pi i N y^2}\,dy = \tfrac12 \int_{Y^2}^{Y'^2} z^{-1/2} e^{2\pi i N z}\,dz,$$

and by the second mean value theorem, or by integration by parts,
this has absolute value $O(Y^{-1})$ as $Y \to \infty$. The convergence of the
integral in the wider sense justifies our earlier remark that, in the
present application of Poisson’s summation formula, it is not
necessary to take together the terms $\nu$ and $-\nu$.
Resuming the evaluation of S, we have

$$S = N(1 + i^{-N}) \int_{-\infty}^{\infty} e^{2\pi i N y^2}\,dy,$$

and this implies, on putting $y = N^{-1/2} u$, that

$$S = (1 + i^{-N})\, C N^{1/2},$$

where C is an absolute constant. This constant is most easily
evaluated by putting N = 1, whereupon S = 1 and we get
$C = (1 + i^{-1})^{-1}$. Hence

$$S = \frac{1 + i^{-N}}{1 + i^{-1}}\, N^{1/2},$$

and this gives the four values stated earlier, according to the residue
class to which N belongs to the modulus 4.


3 


CYCLOTOMY 


Cyclotomy is concerned with the properties of the roots of unity 
of a given order, with particular reference to their algebraic
character.¹ Our first object must be to establish the result quoted in §1,
and this we can do without going very deeply into the theory. 
Afterward I shall digress briefly from the main theme of these 
lectures to discuss two topics in cyclotomy which are of general 
interest. 

We shall be concerned only with roots of unity of prime order.
Let q be a prime other than 2, and let $\zeta$ be a qth root of unity other
than 1. Then the entire set of qth roots of unity other than 1 is

$$\zeta, \zeta^2, \ldots, \zeta^{q-1}, \tag{1}$$

and the sum of these numbers is -1. By using this relation, together
with the relation $\zeta^q = 1$, we can express any polynomial in $\zeta$ with
integral coefficients in the form

$$a_1 \zeta + a_2 \zeta^2 + \cdots + a_{q-1} \zeta^{q-1},$$

where $a_1, \ldots, a_{q-1}$ are integers. Moreover the expression in this
form is unique, since the cyclotomic polynomial

$$x^{q-1} + x^{q-2} + \cdots + x + 1,$$

of which $\zeta$ is a zero, is irreducible over the rational field, and
therefore $\zeta$ cannot satisfy an equation of lower degree with integral
coefficients.

Let g be a primitive root to the modulus q, and let $\nu(n)$ denote the
index of n relative to g. As n assumes the values 1, 2, ..., q - 1, its index
$\nu(n)$ assumes the same values in another order.

Now consider any factorization of q - 1, say

$$q - 1 = ef.$$

¹ The standard references are Bachmann’s Kreisteilung of 1872 and Mathews’s
Theory of Numbers of 1892.

The roots of unity $\zeta^n$ enumerated in (1) can be subdivided according
to the residue class to which $\nu(n)$ belongs to the modulus e. There
will be e such sets, each comprising f numbers. The sums of the
various subsets are called the Gaussian periods of f terms, and
are denoted by $\eta_0, \eta_1, \ldots, \eta_{e-1}$. Thus

$$\eta_j = \sum_{\substack{n=1 \\ \nu(n) \equiv j \ (\mathrm{mod}\ e)}}^{q-1} \zeta^n.$$

It is obviously not essential to restrict j to the range $0 \le j < e$;
the last equation can be used for all j, and then $\eta_j$ is periodic in j
with period e.

In particular, if $e = 2$ and $f = \tfrac12(q - 1)$, we get the two
periods

$$\eta_0 = \sum_R \zeta^R, \qquad \eta_1 = \sum_N \zeta^N,$$

where, as earlier, $R$ and $N$ run through the quadratic residues and
nonresidues respectively, in the range from 1 to $q - 1$. If we fix
$\zeta = e^{2\pi i/q}$, we can deduce the values of these two periods
from the value of Gauss' sum. For then

$$\eta_0 - \eta_1 = \sum_{m=1}^{q-1} \left(\frac{m}{q}\right) \zeta^m = \epsilon q^{1/2},$$

where $\epsilon$ is 1 or $i$ according as $q \equiv 1$ or $3 \pmod 4$.
Since $\eta_0 + \eta_1 = -1$, it follows that

$$\eta_0 = \tfrac12(-1 + \epsilon q^{1/2}), \qquad \eta_1 = \tfrac12(-1 - \epsilon q^{1/2}).$$

In the general case, if we choose $\zeta = e^{2\pi i/q}$, the value of
$\eta_0$ is uniquely determined, and in fact

$$\eta_0 = e^{-1} \sum_{x=1}^{q-1} e_q(x^e).$$

But the individual values of $\eta_1, \ldots, \eta_{e-1}$ will depend
on the choice of the primitive root, and they may get permuted if this
is replaced by another primitive root.
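
The definitions above are easy to check numerically. The following Python sketch computes the Gaussian periods for $\zeta = e^{2\pi i/q}$ and compares them with the two formulas just stated. The function names are illustrative assumptions, and the primitive root is found by brute force, which is adequate only for small primes.

```python
import cmath

def primitive_root(q):
    """Smallest primitive root modulo an odd prime q, found by brute force."""
    for g in range(2, q):
        if len({pow(g, k, q) for k in range(q - 1)}) == q - 1:
            return g
    raise ValueError("no primitive root found; q should be an odd prime")

def gaussian_periods(q, e):
    """Return [eta_0, ..., eta_{e-1}] for zeta = exp(2*pi*i/q), with e | q - 1."""
    if (q - 1) % e != 0:
        raise ValueError("e must divide q - 1")
    g = primitive_root(q)
    zeta = cmath.exp(2j * cmath.pi / q)
    periods = [0j] * e
    for v in range(q - 1):            # v is the index nu(n) of n = g^v
        n = pow(g, v, q)
        periods[v % e] += zeta ** n
    return periods

def eta0_from_power_sum(q, e):
    """The formula eta_0 = e^{-1} * sum_{x=1}^{q-1} e_q(x^e)."""
    zeta = cmath.exp(2j * cmath.pi / q)
    return sum(zeta ** pow(x, e, q) for x in range(1, q)) / e
```

For $q = 13$ and $e = 2$ the two values returned by `gaussian_periods` should agree with $\tfrac12(-1 \pm \sqrt{13})$ up to rounding, and for $q = 7$ with $\tfrac12(-1 \pm i\sqrt 7)$.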

Now let $F(\zeta)$ be any polynomial in $\zeta$, say

$$F(\zeta) = \sum_{r=1}^{q-1} A_r \zeta^r,$$

and suppose $F(\zeta)$ has the property that

$$F(\zeta^m) = F(\zeta)$$

whenever $\nu(m) \equiv 0 \pmod e$. Then, by the uniqueness of
representation of a polynomial, we have

$$A_r = A_s$$

whenever $r \equiv sm \pmod q$, and this holds for all $m$ with
$\nu(m) \equiv 0 \pmod e$. Hence $A_r$ depends only on the residue class
(mod $e$) to which $\nu(r)$ belongs. Grouping together the terms in the
same residue class, and writing $A_j$ for the common value of the
coefficients in the class $j$, we obtain

$$F(\zeta) = A_1 \eta_1 + \cdots + A_e \eta_e.$$

Thus $F(\zeta)$ is a linear combination of the Gaussian periods.

We have tacitly supposed that the coefficients $A_r$ are integers
(or rational numbers), and it is only under some such restriction
that we can appeal to the uniqueness of representation of a
polynomial in $\zeta$. But the result holds equally if the coefficients
$A_r$ are themselves polynomials in an indeterminate $x$ with integral
coefficients, for then it holds for every integral value of $x$, and
therefore identically in $x$.

We apply this, with $e = 2$, to the polynomial

$$F(\zeta) = \prod_R (x - \zeta^R).$$

When written in the standard form, the coefficients $A_r(x)$ are
polynomials in $x$ with integral coefficients. If $m$ is any integer
with $\nu(m) \equiv 0 \pmod 2$, then $m$ is a quadratic residue, and

$$F(\zeta^m) = \prod_R (x - \zeta^{Rm}) = \prod_R (x - \zeta^R) = F(\zeta).$$

Hence $F(\zeta)$ has the property postulated above, and it follows that

$$F(\zeta) = A_0(x)\eta_0 + A_1(x)\eta_1.$$

[Actually $A_0(x) = A_2(x)$, in the notation we have just been using.]
Substituting the values of $\eta_0$ and $\eta_1$, we obtain

$$\prod_R (x - \zeta^R) = \tfrac12 A_0(x)(-1 + \epsilon q^{1/2}) + \tfrac12 A_1(x)(-1 - \epsilon q^{1/2}) = \tfrac12 [Y(x) - \epsilon q^{1/2} Z(x)],$$
where $Y(x)$, $Z(x)$ are polynomials with integral coefficients. If we
replace $\zeta$ by $\zeta^g$, then $\zeta^R$ becomes $\zeta^N$, where $N$
is a typical quadratic nonresidue, and $\eta_0$ and $\eta_1$ become
interchanged. This has the effect of changing $\epsilon$ into
$-\epsilon$. Hence

$$\prod_N (x - \zeta^N) = \tfrac12 [Y(x) + \epsilon q^{1/2} Z(x)].$$

This proves the result quoted in §1; it was used there in the case
$q \equiv 1 \pmod 4$, and so with $\epsilon = 1$.

I shall now discuss two topics connected with cyclotomy, for
the sake of their intrinsic interest. They are (a) Gauss' theorem
on the roots of unity of order $q$, when $q$ is a prime of the form
$2^k + 1$, and (b) Kummer's problem on the cubic periods.


GAUSS’ THEOREM 


Gauss' theorem asserts that if $q$ is a prime of the form $2^k + 1$
(e.g., if $q$ is 3 or 5 or 17 or 257 or 65537), each $q$th root of unity
can be expressed in terms of rational numbers by using a succession
of square root signs. From this assertion, with a few supplementary
observations, one deduces that, for the values of $q$ in question, it
is possible to inscribe a regular polygon of $q$ sides in a given
circle by a Euclidean construction using ruler and compasses only.

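We consider the various choices of $e$ and $f$ that are possible:

$$\begin{aligned}
e_1 &= 2, & f_1 &= \tfrac12(q - 1); \\
e_2 &= 4, & f_2 &= \tfrac14(q - 1); \\
&\;\;\vdots \\
e_k &= 2^k, & f_k &= \frac{1}{2^k}(q - 1) = 1.
\end{aligned}$$

For the choice $e_r$, there are $e_r$ Gaussian periods of $f_r$ terms,
which we shall denote by

$$\eta_0^{(r)}, \ldots, \eta_{e-1}^{(r)} \qquad (e = e_r)$$

to indicate the dependence on $r$.

We have already evaluated the two periods $\eta_0^{(1)}$, $\eta_1^{(1)}$,
and they are $\tfrac12(-1 \pm \sqrt{q'})$, where $q' = q$ if
$q \equiv 1 \pmod 4$ and $q' = -q$ if $q \equiv 3 \pmod 4$. The latter
cannot happen if $q > 3$.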

Now consider the four periods $\eta_0^{(2)}$, $\eta_1^{(2)}$,
$\eta_2^{(2)}$, $\eta_3^{(2)}$. By definition,

$$\eta_j^{(2)} = \sum_{\nu(n) \equiv j \pmod 4} \zeta^n.$$

The expression

$$(x - \eta_0^{(2)})(x - \eta_2^{(2)}),$$

considered as a polynomial in $\zeta$, is unaltered if we replace
$\zeta$ by $\zeta^m$, provided $\nu(m) \equiv 0 \pmod 2$, for the effect
of this is either to leave $\eta_0^{(2)}$ and $\eta_2^{(2)}$ unchanged
or to interchange them. Hence, by our earlier result,

$$(x - \eta_0^{(2)})(x - \eta_2^{(2)}) = A_0(x)\eta_0^{(1)} + A_1(x)\eta_1^{(1)},$$

where $A_0(x)$, $A_1(x)$ have integral coefficients. It follows that the
coefficients of the quadratic in $x$ on the left are expressible by
rational numbers and $\sqrt{q'}$. Hence $\eta_0^{(2)}$, $\eta_2^{(2)}$
are expressible by means of two square root signs, and similarly for
$\eta_1^{(2)}$, $\eta_3^{(2)}$.

The argument continues; at the next step, the eight periods fall
into the four groups:

$$\eta_0^{(3)}, \eta_4^{(3)}; \quad \eta_1^{(3)}, \eta_5^{(3)}; \quad \eta_2^{(3)}, \eta_6^{(3)}; \quad \eta_3^{(3)}, \eta_7^{(3)};$$

and the two in each group can be evaluated in terms of the four
periods $\eta_j^{(2)}$ by use of another square root sign.

Finally, we come to the $2^k$ periods of one term; these are just
$\zeta, \zeta^2, \ldots, \zeta^{q-1}$. Thus each of these is expressible
by means of rational numbers and $k$ square root signs. The $k$
ambiguities of sign attaching to the square roots give the
$2^k$ $(= q - 1)$ roots of unity.

This proves Gauss' theorem in its first form. For the inscription
of a regular polygon of $q$ sides in a circle, it suffices to have the
number $\cos 2\pi/q$, which determines the first point of subdivision
of the circle. Now

$$2 \cos 2\pi/q = \zeta + \zeta^{-1},$$

and this is one of the periods of two terms which arise at the
penultimate stage of the preceding construction, for then
$e = 2^{k-1} = \tfrac12(q - 1)$, and the exponents 1 and $-1$ on the
right above are just the values of $n$ for which the index $\nu(n)$ is
divisible by $e$. Thus we can construct the length $2\cos 2\pi/q$ from a
unit length by solving a succession of quadratic equations. But in
order that this construction shall be capable of realization
geometrically, it is necessary that all the quadratic equations shall
have real roots. Thus we need to know that all the periods
$\eta_j^{(r)}$, with $r \le k - 1$, are real. This is in fact the case.
For if $\eta$ is one such period, then $\bar\eta$ is obtained from
$\eta$ by changing $\zeta$ into $\zeta^{-1}$, and this has the effect of
replacing $\nu(n)$ by $\nu(-n)$. Now
$g^{2^{k-1}} = g^{\frac12(q-1)} \equiv -1 \pmod q$, and therefore
$\nu(-1) = 2^{k-1}$, and so is divisible by $e$ for each of the values
$e = 2, 4, \ldots, 2^{k-1}$. Hence the condition of summation
$\nu(n) \equiv j \pmod e$ is unaltered if $\nu(n)$ is replaced by
$\nu(-n)$, and therefore $\bar\eta = \eta$, that is, $\eta$ is real.
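
The successive stages of this construction can be followed numerically for $q = 17$. The Python sketch below uses the primitive root $g = 3$ (an assumption that is easily verified, since $3^8 \equiv -1 \pmod{17}$) and checks that the periods of four terms $\eta_0^{(2)}$, $\eta_2^{(2)}$ are the roots of a quadratic whose coefficients involve only $\eta_0^{(1)}$, that all periods with $r \le 3$ are real, and that a period of two terms equals $2\cos 2\pi/17$. The helper name `periods` is illustrative.

```python
import cmath
import math

def periods(q, e, g):
    """Gaussian periods eta_0..eta_{e-1} for zeta = exp(2*pi*i/q), indices to base g."""
    zeta = cmath.exp(2j * cmath.pi / q)
    out = [0j] * e
    for v in range(q - 1):                 # v = nu(n) for n = g^v
        out[v % e] += zeta ** pow(g, v, q)
    return out

q, g = 17, 3
eta1 = periods(q, 2, g)                    # two periods of 8 terms
eta2 = periods(q, 4, g)                    # four periods of 4 terms
eta3 = periods(q, 8, g)                    # eight periods of 2 terms

# Coefficients of (x - eta2[0])(x - eta2[2]) = x^2 - sum_02 * x + prod_02.
sum_02 = eta2[0] + eta2[2]
prod_02 = eta2[0] * eta2[2]
```

Here `sum_02` coincides with $\eta_0^{(1)} = \tfrac12(-1 + \sqrt{17})$ and `prod_02` with $-1$, so one further square root gives $\eta_0^{(2)}$ and $\eta_2^{(2)}$.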


KUMMER’S PROBLEM 


Kummer's problem relates to the three periods of $\tfrac13(q - 1)$
terms that exist when $q - 1$ is a multiple of 3. These are

$$\eta_0 = \sum_A \zeta^A, \qquad \eta_1 = \sum_B \zeta^B, \qquad \eta_2 = \sum_C \zeta^C,$$

where $A$ runs through those numbers $n$ of the set $1, 2, \ldots, q - 1$
whose indices are divisible by 3, and $B$ through those whose indices
are $\equiv 1 \pmod 3$, and $C$ through those whose indices are
$\equiv 2 \pmod 3$. The numbers $A$ constitute the cubic residues
(mod $q$), and the numbers $B$ and $C$ constitute the two classes of
cubic nonresidues.

If we choose $\zeta = e^{2\pi i/q}$, as we shall do henceforward, the
value of $\eta_0$ is uniquely determined, and in fact

$$1 + 3\eta_0 = \sum_{x=0}^{q-1} e_q(x^3),$$

since the function $x^3$ assumes the value 0 once and assumes each
of the values $A$ three times. But the values of $\eta_1$ and $\eta_2$
cannot be distinguished from one another unless we specify also the
primitive root by which the index is defined (and, as far as I know,
there is no simple and general way of doing so).

The values of $\eta_0$, $\eta_1$, $\eta_2$ can be expressed in terms of
a Gaussian sum which is similar to the sum $G$ defined in (4) of §1,
but is formed with a cubic character instead of a quadratic character.
Let $\omega$ be a complex cube root of unity, and define

$$\chi(n) = \omega^{\nu(n)}$$

for $n \not\equiv 0 \pmod q$, and put $\chi(n) = 0$ for
$n \equiv 0 \pmod q$. Define

$$\tau = \sum_{n=1}^{q-1} \chi(n) e_q(n).$$


We first prove that $|\tau| = q^{1/2}$. We have

$$|\tau|^2 = \sum_{n_1=1}^{q-1} \sum_{n_2=1}^{q-1} \chi(n_1) \bar\chi(n_2) e_q(n_1 - n_2),$$

and, with a computation similar to that at the beginning of §2,
this is

$$\sum_{n_1=1}^{q-1} \sum_{n=1}^{q-1} \bar\chi(n) e_q(n_1 - n n_1) = q\bar\chi(1) + \sum_{n=1}^{q-1} \bar\chi(n)(-1) = q.$$

This proves the assertion; and we can now write

$$\tau = q^{1/2} e^{i\theta},$$

where $\theta = \theta(q)$. $\theta$ is uniquely determined, as an angle,
except for sign, for the only ambiguity in the definition of $\tau$
lies in the possibility of replacing $\chi$ by $\bar\chi$, and this has
the effect of changing $\tau$ into $\bar\tau$ [since $\chi(-1) = 1$ for
a cubic character] and so of changing $\theta$ into $-\theta$.

The expressions for $\eta_0$, $\eta_1$, $\eta_2$ in terms of $\tau$ are
very easily derived. It is convenient to put

$$1 + 3\eta_j = z_j \qquad (j = 0, 1, 2).$$

Then

$$z_0 = \sum_{x=0}^{q-1} [1 + \chi(x) + \bar\chi(x)] e_q(x) = \tau + \bar\tau = 2q^{1/2} \cos\theta.$$

Similarly

$$z_1 = \sum_{x=0}^{q-1} [1 + \omega^2\chi(x) + \omega\bar\chi(x)] e_q(x) = \omega^2\tau + \omega\bar\tau = 2q^{1/2} \cos\left(\theta - \frac{2\pi}{3}\right),$$

and again

$$z_2 = \sum_{x=0}^{q-1} [1 + \omega\chi(x) + \omega^2\bar\chi(x)] e_q(x) = \omega\tau + \omega^2\bar\tau = 2q^{1/2} \cos\left(\theta + \frac{2\pi}{3}\right).$$
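
These identities lend themselves to a quick numerical check. The sketch below, in Python, takes $\omega = e^{2\pi i/3}$ (the choice under which the cosine formulas hold as written) and a primitive root $g$ supplied by the caller; both the function name and the sample values of $q$ and $g$ are assumptions made for the illustration.

```python
import cmath
import math

def cubic_sum_and_z(q, g):
    """Return (tau, [z_0, z_1, z_2]) for a prime q = 1 (mod 3) with primitive root g."""
    omega = cmath.exp(2j * cmath.pi / 3)      # the cube root of unity defining chi
    zeta = cmath.exp(2j * cmath.pi / q)
    tau = 0j
    eta = [0j, 0j, 0j]
    for v in range(q - 1):                    # v = nu(n) for n = g^v
        term = zeta ** pow(g, v, q)
        tau += omega ** v * term              # chi(n) = omega^{nu(n)}
        eta[v % 3] += term                    # eta_j collects indices = j (mod 3)
    return tau, [1 + 3 * h for h in eta]
```

With `tau, z = cubic_sum_and_z(13, 2)` and `theta = cmath.phase(tau)`, one should find `abs(tau) ** 2` close to 13 and the three `z[j]` close to $2\sqrt{13}\cos\theta$, $2\sqrt{13}\cos(\theta - 2\pi/3)$, $2\sqrt{13}\cos(\theta + 2\pi/3)$.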


Kummer's problem is essentially that of determining the distribution
of the angle $\theta = \theta(q)$, or rather of $\cos\theta$, as $q$
runs through the primes. But he put the problem in a more specific
form. The three numbers $z_0$, $z_1$, $z_2$ are the roots of a cubic
equation with integral coefficients, since this is true of the three
periods $\eta_0$, $\eta_1$, $\eta_2$. It follows from the above
expressions in terms of $\theta$ that there is just one of the three
numbers $z_0$, $z_1$, $z_2$ in each of the intervals

$$(-2\sqrt q, -\sqrt q), \qquad (-\sqrt q, \sqrt q), \qquad (\sqrt q, 2\sqrt q).$$

Kummer asked: With what frequencies does the number $z_0$, which
is uniquely defined for each $q$, fall in each of these intervals? On
somewhat limited numerical evidence he conjectured, very tentatively,
that the relative frequencies may be in the ratios 1:2:3, but more
extensive computation by Mrs. Lehmer^2 made this appear unlikely.
Recently, Heath-Brown and Patterson^3 have shown that the $\theta$ are
uniformly distributed, so that the limiting ratios are 1:1:1. In their
proof they use, among other things, the techniques which we develop
in §24.

^2 Math. Tables and Aids to Computation, 10, 194-202 (1956).
^3 J. reine angew. Math., 310, 111-130 (1979).

It may be of interest to show that $\cos 3\theta$ can be expressed in
terms of the representation of $q$ in the form

$$4q = a^2 + 27b^2;$$

it is easily proved that this representation is unique, except of
course for the signs of $a$ and $b$. We have (with variables of
summation running from 1 to $q - 1$)


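$$\begin{aligned}
\tau^2 &= \sum_x \sum_y \chi(x)\chi(y) e_q(x + y) \\
&= \sum_t \sum_x \chi(t)\bar\chi(x) e_q[x(t + 1)] \\
&= \sum_{t \ne -1} \chi(t) \sum_x \bar\chi(x) e_q[x(t + 1)] \\
&= \sum_{t \ne -1} \chi(t)\chi(1 + t)\bar\tau.
\end{aligned}$$
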
Multiplying by $\tau$, we get

$$\tau^3 = q \sum_t \chi[t(1 + t)] = q(A + B\omega),$$

where $A$ and $B$ are integers. Obviously

$$\bar\tau^3 = q(A + B\bar\omega),$$

and on multiplying the two equations together we get

$$q = (A + B\omega)(A + B\bar\omega) = A^2 - AB + B^2,$$

or

$$4q = (2A - B)^2 + 3B^2.$$

We now prove that $B$ is divisible by 3, this being necessary because
without this stipulation the representation of $q$ in the above form is
not unique. We observe that $\tau$ is an algebraic integer, and that

$$\tau^3 = \left[\sum_x \chi(x) e_q(x)\right]^3 = \sum_x \chi^3(x) e_q(3x) + 3\xi,$$

where $\xi$ is an algebraic integer. Subtracting from this the
corresponding equation for $\bar\tau^3$, we see that
$\tau^3 - \bar\tau^3$ is divisible by 3. But

$$\tau^3 - \bar\tau^3 = qB(\omega - \bar\omega) = iqB\sqrt 3;$$

hence the rational integer $B$ is divisible by 3.

We now have

$$4q = a^2 + 27b^2,$$

where

$$a = 2A - B, \qquad b = \tfrac13 B;$$

and

$$q^{3/2} e^{3i\theta} = \tau^3 = q(A + B\omega) = \tfrac12 q(a + iB\sqrt 3).$$

Hence

$$\cos 3\theta = \frac{a}{2\sqrt q}.$$

This determines $\cos 3\theta$ except for sign. But the ambiguous sign,
arising from the unknown sign of $a$, can also be specified, for it can
be shown that

$$a \equiv 1 \pmod 3.$$


To prove this, we consider the number $N$ of solutions of the
congruence

$$v^3 \equiv u(u + 1) \pmod q.$$

For $u \equiv 0$ or $-1$ there is just one value of $v$, and for any
other $u$ there are either three values of $v$ or none. Hence
$N \equiv 2 \pmod 3$. On the other hand,

$$\begin{aligned}
N &= \sum_u \{1 + \chi[u(u + 1)] + \chi^2[u(u + 1)]\} \\
&= q + (A + B\omega) + (A + B\omega^2) = q + a.
\end{aligned}$$

Hence $a \equiv 2 - q \equiv 1 \pmod 3$.

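
The relation $N = q + a$ and the formula for $\cos 3\theta$ can be tested on small primes. In the following Python sketch the representation $4q = a^2 + 27b^2$ is found by direct search and the sign of $a$ is fixed by $a \equiv 1 \pmod 3$; the function names, and the choice $\omega = e^{2\pi i/3}$ in the cubic character, are assumptions of the illustration.

```python
import cmath
import math

def count_solutions(q):
    """Number N of pairs (u, v) mod q with v^3 = u(u + 1) (mod q)."""
    cube_count = {}
    for v in range(q):
        c = pow(v, 3, q)
        cube_count[c] = cube_count.get(c, 0) + 1
    return sum(cube_count.get(u * (u + 1) % q, 0) for u in range(q))

def normalized_a(q):
    """Search for 4q = a^2 + 27 b^2 and return a with the sign a = 1 (mod 3)."""
    b = 1
    while 27 * b * b < 4 * q:
        rest = 4 * q - 27 * b * b
        a = math.isqrt(rest)
        if a * a == rest:
            return a if a % 3 == 1 else -a
        b += 1
    raise ValueError("no representation; q should be a prime = 1 (mod 3)")

def cubic_gauss_sum(q, g):
    """tau = sum chi(n) e_q(n) with chi(n) = omega^{nu(n)}, index to the base g."""
    omega = cmath.exp(2j * cmath.pi / 3)
    zeta = cmath.exp(2j * cmath.pi / q)
    return sum(omega ** v * zeta ** pow(g, v, q) for v in range(q - 1))
```

For $q = 7$ the search gives $a = 1$ and the count gives $N = 8$; for $q = 13$ it gives $a = -5$ and $N = 8$.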
In conclusion we remark that the cubic equation mentioned
earlier, whose roots are $z_0$, $z_1$, $z_2$, is simply

$$z^3 - 3qz - qa = 0.$$

This follows easily from the expressions for the $z_j$ in terms of
$\tau$ and $\bar\tau$. We have

$$\sum z_j = (\tau + \bar\tau) + (\omega^2\tau + \omega\bar\tau) + (\omega\tau + \omega^2\bar\tau) = 0,$$

$$\sum z_j^2 = (\tau + \bar\tau)^2 + (\omega^2\tau + \omega\bar\tau)^2 + (\omega\tau + \omega^2\bar\tau)^2 = 6\tau\bar\tau = 6q,$$

and

$$z_0 z_1 z_2 = \tau^3 + \bar\tau^3 = q(2A - B) = qa.$$

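
As a check on the cubic equation, the following Python sketch computes $z_j = 1 + 3\eta_j$ directly from the periods and substitutes them into $z^3 - 3qz - qa$. The names are illustrative, the primitive root is supplied by the caller, and $a$ is obtained by search with the normalization $a \equiv 1 \pmod 3$.

```python
import cmath
import math

def z_values(q, g):
    """z_j = 1 + 3 eta_j (j = 0, 1, 2), with indices taken to the base g."""
    zeta = cmath.exp(2j * cmath.pi / q)
    eta = [0j, 0j, 0j]
    for v in range(q - 1):
        eta[v % 3] += zeta ** pow(g, v, q)
    return [1 + 3 * h for h in eta]

def normalized_a(q):
    """The a with 4q = a^2 + 27 b^2 and a = 1 (mod 3), found by search."""
    b = 1
    while 27 * b * b < 4 * q:
        rest = 4 * q - 27 * b * b
        a = math.isqrt(rest)
        if a * a == rest:
            return a if a % 3 == 1 else -a
        b += 1
    raise ValueError("no representation; q should be a prime = 1 (mod 3)")

def cubic_residuals(q, g):
    """Values of z^3 - 3qz - qa at z_0, z_1, z_2; all should be near zero."""
    a = normalized_a(q)
    return [z ** 3 - 3 * q * z - q * a for z in z_values(q, g)]
```

For $q = 7$ the equation is $z^3 - 21z - 7 = 0$, and for $q = 13$ it is $z^3 - 39z + 65 = 0$; in each case the three roots fall one in each of the intervals $(-2\sqrt q, -\sqrt q)$, $(-\sqrt q, \sqrt q)$, $(\sqrt q, 2\sqrt q)$.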
For further information on Kummer's problem, see Mathews,
Theory of Numbers, §§196 and 197; and Hasse, Vorlesungen über
Zahlentheorie (2nd ed., 1964), §20.6.


4 


PRIMES IN ARITHMETIC 
PROGRESSION: THE GENERAL 
MODULUS 


Dirichlet's proof of the existence of primes in a given arithmetic
progression, in the general case when the modulus $q$ is not
necessarily a prime, is in principle a natural extension of that in
the special case. But the proof given in §1 that $L_\omega(1) \ne 0$
when $\omega = -1$, which involved separate consideration of the cases
$q \equiv 1$ and $q \equiv 3 \pmod 4$, does not extend to give the
analogous result that is needed when $q$ is composite.

We now suppose that $q$ is any positive integer other than 1. (We
do not exclude $q = 2$, as we did in §1, though it will in fact be a
trivial case.)

The functions that take the place of the functions $\omega^{\nu(n)}$,
where $\omega^{q-1} = 1$, are Dirichlet's characters to the modulus $q$.
These are functions of an integer variable $n$ which are periodic with
period $q$ and multiplicative without restriction. The typical function
is denoted by $\chi(n)$; it is defined initially when $n$ is relatively
prime to $q$, but the definition is then conveniently extended by
defining $\chi(n)$ to be 0 when $(n, q) > 1$. The number of these
functions will be $\phi(q)$.

Dirichlet's characters to a given modulus can be regarded as a
particular case of the characters of an Abelian group, the group
in question here being that of the relatively prime residue classes
(mod $q$) combined by multiplication. But I shall follow Dirichlet in
giving a direct and constructive account of them. This is partly for
historical reasons, in that Dirichlet's work preceded by several
decades the development of group theory, and partly for a
mathematical reason, namely that the group in question has a simple
and interesting structure which is obscured if one treats it as one
treats the general Abelian group.

Consider first the case when $q$ is a power of a prime other than 2,
say $q = p^\alpha$. Here the construction of §1 extends quite naturally.
There is a primitive root, and the theory of the index applies, the
only difference being that the modulus to which the index is defined
is now $\phi(p^\alpha) = p^{\alpha-1}(p - 1)$ in place of $p - 1$. We
define the characters to the modulus $p^\alpha$ by taking any real or
complex number $\omega$ which satisfies

$$\omega^{\phi(p^\alpha)} = 1$$

and putting

$$\chi(n) = \omega^{\nu(n)} \qquad \text{for } (n, p) = 1,$$

where $\nu(n)$ denotes the index of $n$ relative to a fixed primitive
root of the modulus $p^\alpha$. The number of characters is
$\phi(p^\alpha)$.
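
The construction for an odd prime power can be written out directly. The Python sketch below is a minimal illustration, assuming a brute-force search for the primitive root (suitable only for small moduli); the function name and the dictionary representation of a character are choices made here, not part of the text.

```python
import cmath

def characters_mod_prime_power(p, alpha):
    """All characters mod p^alpha (p an odd prime) as dicts n -> chi(n)."""
    q = p ** alpha
    phi = q - q // p                                  # phi(p^alpha)
    for g in range(2, q):                             # brute-force primitive root
        powers = [pow(g, v, q) for v in range(phi)]
        if len(set(powers)) == phi:
            break
    else:
        raise ValueError("no primitive root found; p should be an odd prime")
    index = {n: v for v, n in enumerate(powers)}      # nu(n) relative to g
    chars = []
    for m in range(phi):                              # omega = exp(2*pi*i*m/phi)
        omega = cmath.exp(2j * cmath.pi * m / phi)
        chars.append({n: (omega ** index[n] if n % p else 0) for n in range(q)})
    return chars
```

For example, `characters_mod_prime_power(3, 2)` returns the $\phi(9) = 6$ characters to the modulus 9, each of them multiplicative and each determined by its value at the primitive root 2.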

The position is a little more complicated when $q = 2^\alpha$. If
$\alpha = 1$ there is only one relatively prime residue class, and only
one character, the value of which is always 1 (for odd $n$, of course).
If $\alpha = 2$, so that $\phi(2^\alpha) = 2$, there is a primitive root,
namely $-1$, and the preceding construction applies: the characters are
$\omega^{\nu(n)}$, where $\omega^2 = 1$. The effect is to give two
characters, one of which is always 1 and the other of which is 1 or
$-1$ according as $n \equiv 1$ or $-1 \pmod 4$. But if $\alpha \ge 3$
there is no primitive root (mod $2^\alpha$): as a substitute for this we
have the fact that every relatively prime residue class is
representable uniquely as

$$(-1)^\nu 5^{\nu'},$$

where $\nu$ is defined to the modulus 2 and $\nu'$ is defined to the
modulus $\tfrac12\phi(2^\alpha) = 2^{\alpha-2}$. By analogy with the
previous construction, we define the characters in the present case by

$$\chi(n) = \omega^\nu (\omega')^{\nu'},$$

where

$$\omega^2 = 1 \quad \text{and} \quad (\omega')^{2^{\alpha-2}} = 1.$$

The number of characters is $2^{\alpha-1} = \phi(2^\alpha)$.

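
The unique representation of the odd residue classes (mod $2^\alpha$) in the form $(-1)^\nu 5^{\nu'}$ is easily confirmed by enumeration for small $\alpha$. The following Python sketch does so; the function name is illustrative.

```python
def sign_five_decomposition(alpha):
    """Map each residue n mod 2^alpha to the list of pairs (nu, nu') with
    n = (-1)^nu * 5^nu' (mod 2^alpha), nu mod 2 and nu' mod 2^(alpha-2)."""
    q = 2 ** alpha
    table = {}
    for nu in range(2):
        for nu_prime in range(2 ** (alpha - 2)):
            n = ((-1) ** nu * pow(5, nu_prime, q)) % q
            table.setdefault(n, []).append((nu, nu_prime))
    return table
```

For $\alpha \ge 3$ the keys of the table are exactly the odd residues, and each has a single pair $(\nu, \nu')$ attached to it.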
In the general case, when

$$q = 2^\alpha p_1^{\alpha_1} p_2^{\alpha_2} \cdots,$$

we define the characters to the modulus $q$ as products of arbitrary
characters to the various prime power moduli. If $\chi(n; 2^\alpha)$
denotes any character to the modulus $2^\alpha$, and similarly for the
other prime powers, the general character to the modulus $q$ is given
by

$$\chi(n) = \chi(n; 2^\alpha)\chi(n; p_1^{\alpha_1})\chi(n; p_2^{\alpha_2}) \cdots,$$

provided $(n, q) = 1$. (The last proviso could be omitted, for if $n$
has a factor in common with $q$, one of the characters in the product
on the right will be 0.) The total number of characters is

$$\phi(2^\alpha)\phi(p_1^{\alpha_1})\phi(p_2^{\alpha_2}) \cdots = \phi(q).$$

It is plain that these characters are distinct arithmetic functions
(this being a consequence of the fact that each index assumes all
values to its appropriate modulus), and that each function is a
periodic and multiplicative function of $n$. One of the characters, got
by taking all the $\omega$'s to be 1, has always the value 1, for
$(n, q) = 1$, and this is called the principal character and denoted by
$\chi_0$.

The characters to a given modulus $q$ themselves form a group
under multiplication, with the principal character as the unit
element. This group, which has $\phi(q)$ elements, is in fact isomorphic
to the multiplicative group of the relatively prime residue classes
(mod $q$). The isomorphism is most easily demonstrated by rewriting the
definition of $\chi(n)$ in terms of the complex exponential function.
For the modulus $p^\alpha$, we have

$$\omega = e^{2\pi i m/\phi(p^\alpha)} = e[m/\phi(p^\alpha)]$$

and the different choices of $\omega$ correspond to different choices of
the integer $m$ to the modulus $\phi(p^\alpha)$. So

$$\chi(n; p^\alpha) = e\left[\frac{m\nu}{\phi(p^\alpha)}\right],$$

where $\nu$ is the index of $n$ relative to a particular primitive root
of $p^\alpha$.
In the case $2^\alpha$, we have

$$\chi(n; 2^\alpha) = e\left(\frac{m\nu}{2} + \frac{m'\nu'}{2^{\alpha-2}}\right),$$

where $n \equiv (-1)^\nu 5^{\nu'} \pmod{2^\alpha}$. Putting these formulas
together, we get

$$\chi(n) = e\left(\frac{m_0\nu_0}{2} + \frac{m_0'\nu_0'}{2^{\alpha-2}} + \frac{m_1\nu_1}{\phi(p_1^{\alpha_1})} + \frac{m_2\nu_2}{\phi(p_2^{\alpha_2})} + \cdots\right) \tag{1}$$

for $(n, q) = 1$, where $m_0, m_0', m_1, m_2, \ldots$ are integers which
take all values modulo the corresponding denominators. The definition
is symmetric in the $m$'s and the $\nu$'s, and we see that
multiplication relative to $n$ (with $\chi$ fixed) corresponds to
addition of the vectors

$$(\nu_0, \nu_0', \nu_1, \nu_2, \ldots)$$

and that multiplication relative to $\chi$ (with $n$ fixed) corresponds
to addition of the vectors

$$(m_0, m_0', m_1, m_2, \ldots),$$

in each case with respect to the appropriate moduli. This duality
renders visible the isomorphism mentioned earlier. We have
assumed above, for simplicity of exposition, that the exponent
$\alpha$ in $2^\alpha$ is at least 3. It will of course be plain that
if $\alpha = 2$ the second of the two terms corresponding to $2^\alpha$
is to be omitted, and that if $\alpha = 1$ both terms are to be omitted.

The characters have an important property that can be expressed
in either of two equivalent forms. In the first form, it states that

$$\sum_n \chi(n) = \begin{cases} \phi(q) & \text{if } \chi = \chi_0, \\ 0 & \text{otherwise,} \end{cases} \tag{2}$$

where the summation is over any representative set of residues
(mod $q$), though it suffices to take a set of relatively prime
residues, since $\chi(n) = 0$ for the others. The truth of the above
statement is an immediate deduction from the representation of the
general character in (1). For the summation over $n$ is equivalent to a
summation over $\nu_0, \nu_0', \nu_1, \nu_2, \ldots$, each to its
respective modulus, and this gives 0 unless each of
$m_0, m_0', m_1, m_2, \ldots$ is congruent to 0 with respect to its
corresponding modulus. In that case, $\chi = \chi_0$, and all the values
of $\chi(n)$ are 1 for $n$ relatively prime to $q$, and the value of
the sum is $\phi(q)$. The second form of the property is that

$$\sum_\chi \chi(n) = \begin{cases} \phi(q) & \text{if } n \equiv 1 \pmod q, \\ 0 & \text{otherwise,} \end{cases} \tag{3}$$

where the summation is over all the $\phi(q)$ characters. The same proof
applies, but with the $m$'s and $\nu$'s interchanged; the only case in
which the sum does not vanish is that in which all the $\nu$'s are 0,
and then $n \equiv 1 \pmod q$. It may be of interest to remark that if
the characters are defined axiomatically, that is, by their periodic
and multiplicative properties, instead of by construction, then (2) is
readily deducible from the definition but (3) is not. To prove this,
one has either to use similar ideas to those we have used in the
construction, or to appeal to the basis theorem for Abelian groups.

Using (3), we can prove that any arithmetic function $X(n)$ that is
multiplicative and has period $q$, and is 0 when $(n, q) > 1$ but not
always 0, is one of the $\phi(q)$ characters $\chi(n)$. For if
$(c, q) = 1$, we have

$$\sum_n X(n)\bar\chi(n) = \sum_n X(cn)\bar\chi(cn) = X(c)\bar\chi(c) \sum_n X(n)\bar\chi(n).$$

Unless $X(c) = \chi(c)$ for all $c$, the sum must be 0. If this is so
for each $\chi$, then

$$0 = \sum_\chi \chi(m) \sum_n X(n)\bar\chi(n) = \sum_n X(n) \sum_\chi \chi(m)\bar\chi(n) = \phi(q)X(m).$$

This gives $X(m) = 0$ for all $m$, contrary to hypothesis.

We can also use (3) to construct, as in §1, a linear combination of
the characters which selects those integers $n$ which fall in a given
residue class (mod $q$). If $(a, q) = 1$, then

$$\frac{1}{\phi(q)} \sum_\chi \bar\chi(a)\chi(n) = \begin{cases} 1 & \text{if } n \equiv a \pmod q, \\ 0 & \text{otherwise;} \end{cases} \tag{4}$$

for we have $\bar\chi(a)\chi(n) = \chi(n')$, where $an' \equiv n \pmod q$,
and $n' \equiv 1 \pmod q$ if and only if $n \equiv a \pmod q$.
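
The construction of the characters by formula (1), and the relations (2), (3), and (4), can all be verified by machine for a small modulus. The Python sketch below is one way of doing so, under stated assumptions: primitive roots are found by brute force, each character is stored as a dictionary of its values on $0, 1, \ldots, q - 1$, and the function names are illustrative.

```python
import cmath
from itertools import product
from math import gcd

def factorize(q):
    """Prime factorization of q as a dict {p: exponent}, by trial division."""
    out, p = {}, 2
    while p * p <= q:
        while q % p == 0:
            out[p] = out.get(p, 0) + 1
            q //= p
        p += 1
    if q > 1:
        out[q] = out.get(q, 0) + 1
    return out

def index_components(q):
    """One triple (prime power, modulus of the index, index table) per term of (1)."""
    comps = []
    for p, a in factorize(q).items():
        pk = p ** a
        if p == 2:
            if a >= 2:      # nu: n = (-1)^nu (mod 4)
                comps.append((pk, 2, {n: (0 if n % 4 == 1 else 1) for n in range(1, pk, 2)}))
            if a >= 3:      # nu': n = (-1)^nu 5^nu' (mod 2^a)
                table = {}
                for v in range(2 ** (a - 2)):
                    f = pow(5, v, pk)
                    table[f] = v
                    table[pk - f] = v
                comps.append((pk, 2 ** (a - 2), table))
        else:
            phi = pk - pk // p
            g = next(g for g in range(2, pk)
                     if len({pow(g, v, pk) for v in range(phi)}) == phi)
            comps.append((pk, phi, {pow(g, v, pk): v for v in range(phi)}))
    return comps

def characters(q):
    """All phi(q) characters mod q; the first one returned is the principal character."""
    comps = index_components(q)
    chars = []
    for ms in product(*(range(mod) for _, mod, _ in comps)):
        chi = {}
        for n in range(q):
            if gcd(n, q) > 1:
                chi[n] = 0
            else:
                x = sum(m * table[n % pk] / mod
                        for m, (pk, mod, table) in zip(ms, comps))
                chi[n] = cmath.exp(2j * cmath.pi * x)
        chars.append(chi)
    return chars
```

With `chars = characters(24)`, summing `chi[n]` over `n` for a fixed character illustrates (2), summing over the characters for a fixed `n` illustrates (3), and the average of `chi[a].conjugate() * chi[n]` over the characters is the selector in (4).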


The $L$ functions for a general modulus $q$ are defined, in the first
place for $s > 1$, by

$$L(s, \chi) = \sum_{n=1}^\infty \chi(n) n^{-s}.$$

As in §1, each of them has an Euler product expression:

$$L(s, \chi) = \prod_p [1 - \chi(p)p^{-s}]^{-1},$$

and $L(s, \chi) \ne 0$ for $s > 1$. We have

$$\log L(s, \chi) = \sum_p \sum_{m=1}^\infty m^{-1}\chi(p^m)p^{-ms},$$

and on forming a linear combination of these logarithms and using
the relation (4), we obtain

$$\frac{1}{\phi(q)} \sum_\chi \bar\chi(a) \log L(s, \chi) = \sum_p \sum_{\substack{m \\ p^m \equiv a \pmod q}} m^{-1}p^{-ms}. \tag{5}$$

As in §1, the right side is

$$\sum_{p \equiv a \pmod q} p^{-s} + O(1)$$

as $s \to 1$ from the right. Thus our object, as before, is to prove
that the left side of (5) tends to $+\infty$ as $s \to 1$ from the
right.

The term corresponding to the principal character $\chi_0$ is

$$\frac{1}{\phi(q)} \log L(s, \chi_0).$$

By the Euler product formula, we have

$$L(s, \chi_0) = \zeta(s) \prod_{p \mid q} (1 - p^{-s}), \tag{6}$$

and therefore $\log L(s, \chi_0) \to +\infty$ as $s \to 1$. It therefore
suffices to prove that, for $\chi \ne \chi_0$, $\log L(s, \chi)$ is
bounded as $s \to 1$, and again this is equivalent to proving that
$L(1, \chi) \ne 0$.

If $\chi$ is a complex character, that is, a character whose values
are not all real, so that $\bar\chi \ne \chi$, this follows as in §1
from the inequality

$$\prod_\chi L(s, \chi) \ge 1 \qquad \text{for } s > 1,$$

which is proved in the same way as before by taking $a = 1$ in (5).

It remains to prove that $L(1, \chi) \ne 0$ when $\chi$ is a real
character other than the principal character. Dirichlet deduced this
from his famous class-number formula, an account of which will be
given in §6. But to complete the proof of Dirichlet's theorem now, I
shall deviate from the historical order of discovery and give a simple
proof due to de la Vallée Poussin,^1 which is based on complex
function theory.

For this proof we need to know a little about the behavior of the
$L$ functions as functions of a complex variable $s$. We write
$s = \sigma + it$, as is customary in this subject. The series which
defines $L(s, \chi)$ is absolutely convergent for $\sigma > 1$, and is
uniformly convergent with respect to $s$ for $\sigma \ge 1 + \delta$ for
any positive $\delta$. Hence the $L$ functions are defined for
$\sigma > 1$ and are regular functions of $s$ there. We can, however,
easily prove that each of them can be continued analytically so as to
be regular for $\sigma > 0$, except that $L(s, \chi_0)$ has a simple
pole at $s = 1$.

We deal first with $L(s, \chi_0)$, and in view of the simple relation
between $L(s, \chi_0)$ and $\zeta(s)$ given in (6), it will suffice to
consider $\zeta(s)$. We transform the definition
$\zeta(s) = \sum n^{-s}$, which is applicable for $\sigma > 1$, into a
form that is applicable more generally for $\sigma > 0$. This is done by
partial summation, but it is a technical convenience to use integrals
rather than sums. We have

$$\begin{aligned}
\zeta(s) &= \sum_{n=1}^\infty n^{-s} = \sum_{n=1}^\infty n[n^{-s} - (n + 1)^{-s}] \\
&= s \sum_{n=1}^\infty n \int_n^{n+1} x^{-s-1}\,dx \\
&= s \int_1^\infty [x] x^{-s-1}\,dx.
\end{aligned}$$

We now put $[x] = x - (x)$, so that $(x)$ denotes the fractional part
of $x$. This gives

$$\zeta(s) = \frac{s}{s - 1} - s \int_1^\infty (x) x^{-s-1}\,dx. \tag{7}$$

^1 "Recherches analytiques sur la théorie des nombres premiers," Deuxième partie,
Ann. Soc. Sci. Bruxelles, 20, 281-362 (1896).

The integral on the right is absolutely convergent for $\sigma > 0$,
and uniformly for $\sigma \ge \delta > 0$, and so represents a regular
function of $s$ for $\sigma > 0$. Thus $\zeta(s)$ is meromorphic for
$\sigma > 0$, its only pole being a simple pole at $s = 1$ with residue
1. In view of (6), the same is true of $L(s, \chi_0)$, except that the
residue is

$$\prod_{p \mid q} (1 - p^{-1}) = q^{-1}\phi(q).$$
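
Formula (7) can be tried out numerically for real $s$. In the Python sketch below the integral of $(x)x^{-s-1}$ over each interval $[n, n + 1]$ is evaluated in closed form and the infinite integral is truncated after a fixed number of intervals; the function name and the truncation point are assumptions of the illustration, and the truncation error is appreciable when $s$ is small.

```python
import math

def zeta_via_formula_7(s, intervals=200_000):
    """Approximate zeta(s) for real s > 0, s != 1, from
    zeta(s) = s/(s-1) - s * int_1^infinity (x) x^{-s-1} dx."""
    integral = 0.0
    for n in range(1, intervals + 1):
        # Exact value of int_n^{n+1} (x - n) x^{-s-1} dx.
        integral += ((n + 1) ** (1 - s) - n ** (1 - s)) / (1 - s) \
            + n * ((n + 1) ** (-s) - n ** (-s)) / s
    return s / (s - 1) - s * integral
```

At $s = 2$ the result agrees closely with $\pi^2/6$, and at $s = \tfrac12$, where the original series diverges, it gives a value near $-1.46$, in accordance with the analytic continuation.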

There is a similar calculation when $\chi \ne \chi_0$. If we define
temporarily

$$S(x) = \sum_{n \le x} \chi(n)$$

then

$$\begin{aligned}
L(s, \chi) &= \sum_{n=1}^\infty \chi(n) n^{-s} = \sum_{n=1}^\infty S(n)[n^{-s} - (n + 1)^{-s}] \\
&= s \int_1^\infty S(x) x^{-s-1}\,dx,
\end{aligned} \tag{8}$$

for $\sigma > 1$. Since $\chi \ne \chi_0$, it follows from (2) that
$\sum \chi(n)$ over any $q$ consecutive integers is 0, and therefore
that $S(x)$ is a bounded function of $x$. Thus the last integral gives
the analytic continuation of $L(s, \chi)$ as a regular function for
$\sigma > 0$.
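
For the nonprincipal character to the modulus 4 the boundedness of $S(x)$ is visible at once, and the partial summation in (8) can be carried out at $s = 1$, where it converges because $S$ is bounded. The Python sketch below does this; the names are illustrative, and the value $\pi/4$ used for comparison is the classical sum of Leibniz' series $1 - \tfrac13 + \tfrac15 - \cdots$.

```python
import math

def chi4(n):
    """The nonprincipal character to the modulus 4."""
    return (0, 1, 0, -1)[n % 4]

def S(x):
    """S(x) = sum of chi4(n) over 1 <= n <= x, for a nonnegative integer x."""
    return sum(chi4(n) for n in range(1, x + 1))

def L1_by_partial_summation(limit):
    """sum_{n <= limit} S(n) [1/n - 1/(n+1)], the first line of (8) taken at s = 1."""
    total, s_n = 0.0, 0
    for n in range(1, limit + 1):
        s_n += chi4(n)
        total += s_n * (1.0 / n - 1.0 / (n + 1))
    return total
```

Here $S(x)$ takes only the values 0 and 1, and `L1_by_partial_summation(100000)` is close to $\pi/4$, which in particular is not 0.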

Suppose now that $\chi$ is a real nonprincipal character (mod $q$) and
that $L(1, \chi) = 0$. Then $L(s, \chi)$ has a zero at $s = 1$, and the
product

$$L(s, \chi)L(s, \chi_0)$$

is regular at $s = 1$ and therefore regular for $\sigma > 0$. Since
$L(2s, \chi_0)$ is regular and different from 0 for
$\sigma > \tfrac12$, the function

$$\psi(s) = \frac{L(s, \chi)L(s, \chi_0)}{L(2s, \chi_0)}$$

is regular for $\sigma > \tfrac12$. We observe further that
$\psi(s) \to 0$ as $s \to \tfrac12$ from the right, since
$L(2s, \chi_0) \to +\infty$.

The Euler product formula for $\psi(s)$ contains only factors
corresponding to primes that do not divide $q$, and indeed contains
only factors corresponding to primes for which $\chi(p) = 1$, since if
$\chi(p) = -1$ the factor is

$$\frac{(1 + p^{-s})^{-1}(1 - p^{-s})^{-1}}{(1 - p^{-2s})^{-1}} = 1.$$

Thus we get

$$\psi(s) = \prod_{\chi(p) = 1} \frac{1 + p^{-s}}{1 - p^{-s}}.$$

This holds for $\sigma > 1$. If there were no primes with $\chi(p) = 1$
we should have $\psi(s) = 1$ for all $\sigma > 1$, and therefore by
analytic continuation for all $\sigma > \tfrac12$, and this is contrary
to the fact that $\psi(s) \to 0$ as $s \to \tfrac12$.

Plainly the above product can be written as a Dirichlet series:

$$\psi(s) = \sum_{n=1}^\infty a_n n^{-s},$$

where $a_n \ge 0$ and $a_1 = 1$. This series is only valid, however, for
$\sigma > 1$ (as far as we know).

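
The nonnegativity of the coefficients $a_n$ is a formal property of the Dirichlet series and does not depend on the hypothesis $L(1, \chi) = 0$, so it can be observed for an actual character. The Python sketch below computes the first coefficients for the nonprincipal character to the modulus 4 by Dirichlet convolution, using $1/L(2s, \chi_0) = \sum \mu(m)\chi_0(m)m^{-2s}$; the function names are illustrative.

```python
def chi4(n):
    """The nonprincipal character to the modulus 4."""
    return (0, 1, 0, -1)[n % 4]

def chi0(n):
    """The principal character to the modulus 4."""
    return n % 2

def mobius(n):
    """The Moebius function, by trial division."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0
            result = -result
        p += 1
    return -result if n > 1 else result

def psi_coefficients(limit):
    """a_0 (unused), a_1, ..., a_limit for psi(s) = L(s,chi4) L(s,chi0) / L(2s,chi0)."""
    # Coefficients of L(s, chi4) L(s, chi0): the Dirichlet convolution chi4 * chi0.
    conv = [0] * (limit + 1)
    for d in range(1, limit + 1):
        if chi4(d):
            for k in range(d, limit + 1, d):
                conv[k] += chi4(d) * chi0(k // d)
    # Multiply by 1 / L(2s, chi0), whose coefficients live on the squares m^2.
    a = [0] * (limit + 1)
    m = 1
    while m * m <= limit:
        c = mobius(m) * chi0(m)
        if c:
            for k in range(m * m, limit + 1, m * m):
                a[k] += c * conv[k // (m * m)]
        m += 1
    return a
```

In agreement with the Euler product, $a_n$ is 2 at each power of a prime $p \equiv 1 \pmod 4$, is 0 whenever $n$ is even or divisible by a prime $p \equiv 3 \pmod 4$, and is never negative.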
Since $\psi(s)$ is regular for $\sigma > \tfrac12$, it has an
expansion in powers of $s - 2$ with a radius of convergence at least
$\tfrac32$. This power series is

$$\psi(s) = \sum_{m=0}^\infty \frac{1}{m!} \psi^{(m)}(2)(s - 2)^m.$$

We can calculate $\psi^{(m)}(2)$ from the Dirichlet series, and we
obtain

$$\psi^{(m)}(2) = (-1)^m \sum_{n=1}^\infty a_n (\log n)^m n^{-2} = (-1)^m b_m,$$

say, where $b_m \ge 0$. Hence

$$\psi(s) = \sum_{m=0}^\infty \frac{1}{m!} b_m (2 - s)^m,$$

and this holds for $|2 - s| < \tfrac32$. If $\tfrac12 < s < 2$, then
since all the terms are nonnegative we have

$$\psi(s) \ge \psi(2) \ge 1,$$

and this contradicts the fact that $\psi(s) \to 0$ as
$s \to \tfrac12$. Thus the hypothesis that $L(1, \chi) = 0$ is
disproved.^2

We have therefore completed the proof of Dirichlet's theorem that
there are infinitely many primes $p \equiv a \pmod q$, and the series
$\sum p^{-1}$ summed over such primes is divergent.


^2 A somewhat different proof, but on similar general lines, was given
by Landau in 1905 (see, for example, Prachar, Chap. 4, Satz 4.2). There
is also an elementary but rather complicated proof due to Mertens,
which will be found in Landau, Vorlesungen I, Satz 152.


5


PRIMITIVE CHARACTERS


Many results about characters and $L$ functions take a simple form
only for the so-called primitive characters, though they may be
capable of extension, with complications, to imprimitive characters.
We shall now explain the distinction between these two types of
character, and afterward investigate in detail the real primitive
characters.

Let $\chi(n)$ be any character to the modulus $q$ other than the
principal character. If $(n, q) > 1$, then $\chi(n) = 0$; if
$(n, q) = 1$, then $\chi(n) \ne 0$, being a root of unity, and is a
periodic function of $n$ with period $q$. It is possible, however, that
for values of $n$ restricted by the condition $(n, q) = 1$, the
function $\chi(n)$ may have a period less than $q$. If so, we say that
$\chi$ is imprimitive, and otherwise primitive.^1 It is a matter of
personal preference whether one includes the principal character
among the imprimitive characters; I prefer to leave it unclassified.

Let $\chi(n)$ be a nonprincipal character to the modulus $q$ which is
imprimitive, and let $q_1$ be its least period. Then $q_1 < q$; and
$q_1 > 1$, for otherwise we should have $\chi(n) = \chi(1) = 1$ for all
$n$ satisfying $(n, q) = 1$, contrary to the supposition that $\chi$ is
not the principal character. Further, $q_1$ is a factor of $q$, for by
a familiar argument if $q$ and $q_1$ are periods then so is $(q, q_1)$,
and therefore this number cannot be less than $q_1$.

We shall prove that $\chi(n)$ is identical, when $(n, q) = 1$, with a
character $\chi_1(n)$ to the modulus $q_1$; but before we can prove this
we must define $\chi_1(n)$. Of course, we define $\chi_1(n)$ to be
$\chi(n)$ if $(n, q) = 1$; and if $(n, q_1) = 1$ but $(n, q) > 1$, we
choose any integer $t$ such that

$$(n + tq_1, q) = 1 \tag{1}$$

and define $\chi_1(n) = \chi(n + tq_1)$. Such an integer exists, for it
suffices to have

$$(n + tq_1, r) = 1,$$

where^2 $r$ is the product of those prime power constituents of $q$
which are relatively prime to $q_1$. The choice of $t$, subject to (1),
is immaterial, since the value of $\chi(n + tq_1)$ will be the same.

^1 Alternative terms are improper and proper.

We have now defined $\chi_1(n)$ when $(n, q_1) = 1$, and of course we
take it to be 0 when $(n, q_1) > 1$. Plainly $\chi_1(n)$ is periodic
with period $q_1$, and its multiplicative property follows easily from
that of $\chi(n)$. Further, $\chi_1(n)$ is not always 0 when
$(n, q_1) = 1$, for $\chi_1(1) = \chi(1) = 1$. Hence, by a result proved
in §4, it is one of the $\phi(q_1)$ characters to the modulus $q_1$. The
values of $\chi_1(n)$ when $(n, q_1) = 1$ include the values of
$\chi(n)$ when $(n, q) = 1$, and so cannot be periodic with period less
than $q_1$; nor can they all be 1. Hence $\chi_1(n)$ is a primitive
character to the modulus $q_1$. We have now proved that to an
imprimitive character $\chi$ (mod $q$) there corresponds a proper
factor $q_1$ of $q$ and a primitive character $\chi_1$ (mod $q_1$) such
that

$$\chi(n) = \begin{cases} \chi_1(n) & \text{if } (n, q) = 1, \\ 0 & \text{if } (n, q) > 1. \end{cases} \tag{2}$$

We say that $\chi_1$ induces $\chi$. It is clear that if $q_1$ and
$\chi_1$ are given, and $q$ is any proper multiple of $q_1$, the above
definition of $\chi$ does in fact produce a character (mod $q$).

For example, the Legendre symbol $(n|p)$ is an imprimitive character
(mod $p^\alpha$) if $\alpha > 1$, being induced by the same character
(mod $p$); but this is a particularly simple case, since here the
conditions $(n, q) = 1$ and $(n, q_1) = 1$ are synonymous. Or again the
Legendre symbol $(n|p_1)$ induces an imprimitive character to the
modulus $p_1p_2$ (where $p_1 \ne p_2$) by the definition

$$\chi(n) = \begin{cases} \left(\dfrac{n}{p_1}\right) & \text{if } (n, p_1p_2) = 1, \\ 0 & \text{if } (n, p_1p_2) > 1. \end{cases}$$
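
The least period $q_1$ can be found mechanically for a character given by its values. The Python sketch below treats $d$ as a period when $\chi(n) = \chi(m)$ for all $n \equiv m \pmod d$ with $n$, $m$ both prime to $q$ (the sense in which the period is used in the definition of $\chi_1$), and searches the divisors of $q$ in increasing order. The function names and the examples with $p_1 = 3$, $p_2 = 5$ are choices made for the illustration.

```python
from math import gcd

def legendre(n, p):
    """Legendre symbol (n|p) for an odd prime p, by Euler's criterion."""
    r = pow(n, (p - 1) // 2, p)
    return -1 if r == p - 1 else r

def least_period(chi, q):
    """Least divisor d of q such that chi(n) = chi(m) whenever
    n = m (mod d) and n, m are both prime to q."""
    units = [n for n in range(1, q + 1) if gcd(n, q) == 1]
    for d in range(1, q + 1):
        if q % d:
            continue
        value_on_class = {}
        consistent = True
        for n in units:
            if value_on_class.setdefault(n % d, chi(n)) != chi(n):
                consistent = False
                break
        if consistent:
            return d
    return q

def induced_mod_15(n):
    """The character mod 15 induced by the Legendre symbol (n|3)."""
    return legendre(n, 3) if gcd(n, 15) == 1 else 0

def product_mod_15(n):
    """The character (n|3)(n|5) mod 15."""
    return legendre(n, 3) * legendre(n, 5)
```

Here `least_period(induced_mod_15, 15)` is 3, so that character is imprimitive, whereas `least_period(product_mod_15, 15)` is 15 and that character is primitive.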


As we saw in §4, any character (mod $q$) is representable as

$$\chi(n) = \chi(n; p_1^{\alpha_1})\chi(n; p_2^{\alpha_2}) \cdots$$

where $q = p_1^{\alpha_1} p_2^{\alpha_2} \cdots$, and the characters on the
right are to the moduli indicated. (We allow $p_1$ to be 2 here.) It is
easily seen that $\chi$ is primitive if and only if each of the
characters on the right is primitive. If $\chi$ is imprimitive, one or
more of the characters on the right is either principal or imprimitive,
and in the latter case $\chi(n; p^\alpha) = \chi(n; p^\beta)$, where
$1 \le \beta < \alpha$. Then $q_1$ is the product of the prime powers
$p^\beta$, and $\chi_1$ is the product of the characters
$\chi(n; p^\beta)$, but omitting any factors that are principal
characters.

^2 $r$ is not the same as $q/(q, q_1)$.

Expressed in terms of the representation by the complex
exponential function, in (1) of §4, a character is primitive if and
only if all the $m_j$ are relatively prime to the corresponding $p_j$
(with an obvious modification for $m_0$ and $m_0'$ depending on whether
$\alpha > 2$ or $\alpha = 2$).

The relation (2) between an imprimitive character $\chi$ and the
primitive character $\chi_1$ which induces it implies a simple relation
between the corresponding $L$ functions. By the Euler product formula,

$$\begin{aligned}
L(s, \chi) &= \prod_{p \nmid q} [1 - \chi(p)p^{-s}]^{-1} \\
&= \prod_{p \nmid q} [1 - \chi_1(p)p^{-s}]^{-1} \\
&= L(s, \chi_1) \prod_{p \mid q} [1 - \chi_1(p)p^{-s}].
\end{aligned} \tag{3}$$

The above argument is valid only for $\sigma > 1$, where the infinite
products converge; but by analytic continuation the result remains
true for $\sigma > 0$, and indeed in the whole $s$ plane, as we shall
see later. In particular, $L(1, \chi_1) \ne 0$ implies
$L(1, \chi) \ne 0$.
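
Relation (3) can be illustrated with truncated series at $s = 2$. In the Python sketch below $\chi_1$ is the Legendre symbol $(n|3)$ and $\chi$ is the character it induces to the modulus 15; since $\chi_1(3) = 0$, the only nontrivial factor in the product over $p \mid q$ is the one for $p = 5$. The function names, the truncation point, and the choice of example are assumptions of the illustration.

```python
def legendre3(n):
    """The Legendre symbol (n|3), a primitive character mod 3."""
    return (0, 1, -1)[n % 3]

def chi15(n):
    """The character mod 15 induced by (n|3): it vanishes when 3 | n or 5 | n."""
    return legendre3(n) if n % 5 else 0

def truncated_L(chi, s, terms=200_000):
    """Partial sum of the Dirichlet series of L(s, chi), for real s > 1."""
    return sum(chi(n) / n ** s for n in range(1, terms + 1))

s = 2.0
lhs = truncated_L(chi15, s)
rhs = truncated_L(legendre3, s) * (1 - legendre3(5) * 5 ** (-s))
```

The two numbers `lhs` and `rhs` agree to within the truncation error, which is of the order of the reciprocal of the number of terms.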

We now turn to the real primitive characters, which are of 
particular interest in several ways. The obvious question is: For 
what moduli does there exist a real primitive character (or possibly 
more than one), and how can such characters be expressed in terms 
of quadratic residue symbols? The general nature of the answer 
is that only for certain types of q does a real primitive character
exist, and it is then expressible (for n > 0) as

\[
\chi(n) = \left(\frac{d}{n}\right),
\]

where the symbol on the right is Kronecker’s extension of Legendre’s
symbol, and d = ±q. In some cases, but not in all, d can be both
+q and −q, and then there are two characters.

We have seen that a primitive character (mod q) is a product of
primitive characters with the prime power constituents of q as
moduli. Consider first a prime power p^α for which p > 2. The
character is

\[
e\!\left(\frac{m\,\nu(n)}{p^{\alpha-1}(p-1)}\right) \quad \text{for } (n, p) = 1.
\]

Since e(x) is real only if x ≡ 0 or ½ (mod 1), and since a possible value
of ν(n) is 1, this is a real function only if m is divisible by
½p^{α−1}(p − 1). We must therefore have α = 1, for if α > 1 we should
have m divisible by p and the character would be imprimitive. We can
take m = ½(p − 1), since m = 0 would give the principal character, and
now the function becomes

\[
e[\tfrac12 \nu(n)] = (-1)^{\nu(n)} = \left(\frac{n}{p}\right).
\]

Thus p^α must be p and χ(n; p^α) must be (n|p).

Now consider the modulus 2^α, where α ≥ 2 of necessity, since for
α = 1 there is only the principal character. If α = 2, there is just
one nonprincipal character, namely

\[
\chi_4(n) = \begin{cases} 1 & \text{if } n \equiv 1 \pmod 4, \\ -1 & \text{if } n \equiv -1 \pmod 4, \end{cases} \tag{4}
\]

and this is obviously primitive. If α ≥ 3, the general character is

\[
e\!\left(\frac{m_0\nu_0}{2} + \frac{m_0'\nu_0'}{2^{\alpha-2}}\right),
\]

where 0 ≤ m₀ < 2, 0 ≤ m₀′ < 2^{α−2}, and ν₀, ν₀′ are defined by

\[
n \equiv (-1)^{\nu_0} 5^{\nu_0'} \pmod{2^{\alpha}}.
\]

The character can only be real if m₀′ is divisible by 2^{α−3}, and if
α > 3 this implies that the character is imprimitive. We must have
α = 3, and there are the two possibilities m₀ = 0, m₀′ = 1 and m₀ = 1,
m₀′ = 1. [The other possibility, m₀ = 1, m₀′ = 0, leads to χ₄(n),
which is imprimitive to the modulus 8.] The first of these gives a
character, which we shall denote by χ₈(n), according to the rule

\[
\chi_8(n) = \begin{cases} 1 & \text{if } n \equiv \pm 1 \pmod 8, \\ -1 & \text{if } n \equiv \pm 3 \pmod 8; \end{cases} \tag{5}
\]

and the second possibility gives χ₄(n)χ₈(n). Both these are primitive.
Thus the only prime power moduli to which there exist real
primitive characters are:

\[
\begin{array}{ll}
p\ (>2) & \text{with the character } (n|p), \\
4 & \text{with the character } \chi_4(n), \\
8 & \text{with the characters } \chi_8(n) \text{ and } \chi_4(n)\chi_8(n).
\end{array} \tag{6}
\]


A real primitive character exists to the modulus q if and only if q 
is a product of such moduli, subject to the factors being relatively 
prime, and the character is then the product of the corresponding 
characters given above. There are two of them if and only if q 
includes the factor 8. We shall call the moduli listed above the 
basic moduli and the characters the basic characters.

We can express most of the basic characters, if we limit ourselves
to positive values of n, in terms of Jacobi’s symbol (m|n), which is
defined (by multiplying together the corresponding Legendre
symbols) when n is odd and positive. We have*

\[
\chi_4(n) = \left(\frac{-1}{n}\right) = \left(\frac{-4}{n}\right), \qquad
\chi_8(n) = \left(\frac{2}{n}\right) = \left(\frac{8}{n}\right), \qquad
\chi_4(n)\chi_8(n) = \left(\frac{-2}{n}\right) = \left(\frac{-8}{n}\right), \tag{7}
\]

provided n is odd, which it naturally is when the modulus is 4 or 8.
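We also have, by the law of quadratic reciprocity,*

\[
\left(\frac{n}{p}\right) = \left(\frac{(-1)^{\frac12(p-1)}p}{n}\right), \tag{8}
\]

provided n is odd. But here the limitation to odd n is an undesirable
restriction. It is removed by employing Kronecker’s extension of
Legendre’s symbol, by which one puts

\[
\left(\frac{p'}{2}\right) = \left(\frac{2}{p}\right), \quad \text{where } p' = (-1)^{\frac12(p-1)}p,
\]

and, more generally,

\[
\left(\frac{p'}{2^{\lambda}m}\right) = \left(\frac{2}{p}\right)^{\lambda}\left(\frac{p'}{m}\right) \quad \text{for odd } m > 0.
\]

With this extension, relation (8) holds whether n is odd or even.
It holds also in the more general form

\[
\left(\frac{n}{P}\right) = \left(\frac{P'}{n}\right), \quad \text{where } P' = (-1)^{\frac12(P-1)}P,
\]

if P = p₁p₂...; that is, if P is any square-free odd positive integer;
for then P′ = p₁′p₂′....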

We have now expressed all the basic characters by quadratic
residue symbols; they are

\[
\left(\frac{-4}{n}\right), \quad \left(\frac{8}{n}\right), \quad \left(\frac{-8}{n}\right), \quad \left(\frac{p'}{n}\right),
\]

and the modulus of each character is the absolute value of the upper
number. Moreover, we have

\[
\left(\frac{d_1}{n}\right)\left(\frac{d_2}{n}\right) = \left(\frac{d_1d_2}{n}\right),
\]

provided d₁ and d₂ are relatively prime. This is a consequence of
the multiplicative property of the Jacobi symbol (and so of the
Legendre symbol) if n is odd, and a consequence of Kronecker’s
definition if n is even (which it can only be if d₁, d₂ are odd).

* Landau, Vorlesungen, Satz 92 and Satz 93.
* Landau, Vorlesungen, Satz 95.

It follows that the real primitive characters are identical with the
symbols (d|n), where d is a product of relatively prime factors of the
form

\[
-4, \quad 8, \quad -8, \quad (-1)^{\frac12(p-1)}p \quad (p > 2); \tag{9}
\]

and the symbol is a real primitive character to the modulus |d|.

There is an intimate connection between the real primitive 
characters and the theory of binary quadratic forms, or the equivalent 
theory of quadratic fields. We prove, in the first place, that the 
numbers d described above are identical with the numbers that 
arise as fundamental discriminants in the theory of quadratic forms, 
or as discriminants in the theory of quadratic fields. 

The numbers (−1)^{½(p−1)}p are all congruent to 1 (mod 4), and the
products of relatively prime factors (i.e., distinct factors) each of
this form comprise all square-free integers, positive and negative,
that are congruent to 1 (mod 4). In addition, we get all such numbers
multiplied by −4, that is, all numbers 4N, where N is square-free
and congruent to 3 (mod 4). Finally, we get all such numbers
multiplied by ±8, which is equivalent to saying all numbers 4N, where
N is square-free and congruent to 2 (mod 4). Thus we get (a) all
integers, positive and negative, that are ≡ 1 (mod 4) and square-free,
and (b) all integers, positive and negative, of the form 4N, where
N ≡ 2 or 3 (mod 4) and square-free.
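The equality of the two descriptions can be checked mechanically for small |d|. The sketch below is only an illustration: the function names are ours, the bound is arbitrary, and the empty product d = 1 is excluded from both lists. It builds the products of relatively prime factors from (9) and compares them with the integers described in (a) and (b).

```python
def is_squarefree(n):
    """True if no square k*k > 1 divides the nonzero integer n."""
    n = abs(n)
    k = 2
    while k * k <= n:
        if n % (k * k) == 0:
            return False
        k += 1
    return True


def odd_primes(bound):
    """Odd primes up to bound, found by trial division."""
    return [p for p in range(3, bound + 1, 2)
            if all(p % r for r in range(3, int(p ** 0.5) + 1, 2))]


def products_of_prime_discriminants(bound):
    """All d != 1 with |d| <= bound that are products of coprime factors from (9)."""
    primes = odd_primes(bound)
    found = set()

    def extend(d, start):
        found.add(d)
        for i in range(start, len(primes)):
            p = primes[i]
            if abs(d) * p > bound:
                break
            # (-1)^((p-1)/2) * p is p when p = 1 (mod 4) and -p otherwise
            extend(d * (p if p % 4 == 1 else -p), i + 1)

    # at most one of -4, 8, -8 can occur, since they are not coprime
    for even_part in (1, -4, 8, -8):
        if abs(even_part) <= bound:
            extend(even_part, 0)
    found.discard(1)
    return found


def discriminants_by_congruence(bound):
    """All d != 1 with |d| <= bound described by (a) or (b)."""
    found = set()
    for d in range(-bound, bound + 1):
        if d in (0, 1):
            continue
        if d % 4 == 1 and is_squarefree(d):
            found.add(d)
        elif d % 4 == 0 and (d // 4) % 4 in (2, 3) and is_squarefree(d // 4):
            found.add(d)
    return found
```

For the bound 20 both functions return −20, −19, −15, −11, −8, −7, −4, −3, 5, 8, 12, 13, 17.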

These are just the discriminants of quadratic fields. For a quadratic
field is generated by √N, where N is a square-free integer (positive
or negative); and an integral basis of the field is given by

\[
(1, \sqrt{N}) \ \text{ if } N \equiv 2 \text{ or } 3 \pmod 4, \qquad
(1, \tfrac12 + \tfrac12\sqrt{N}) \ \text{ if } N \equiv 1 \pmod 4.
\]


The discriminant, being the square of the determinant formed 
by an integral basis and the (algebraically) conjugate basis, is 4N 
in the first case and N in the second case. Hence the discriminants 
are just the numbers described in (a) and (b) above.

In the theory of quadratic forms, the discriminant of

\[
ax^2 + bxy + cy^2
\]

is the familiar algebraic invariant D = b² − 4ac. In this theory one
presupposes that D is not a perfect square, since in that case the
form has rational linear factors. Thus a discriminant is an integer,
not a square, which is congruent to 0 or 1 (mod 4). A fundamental
discriminant is one which has the property that all forms of that
discriminant have (a, b, c) = 1. We can easily prove that the
fundamental discriminants are just the numbers d described in (a) and
(b). First, if D = d and (a, b, c) = m > 1, then m² divides d, and
therefore d must be of the type (b) and m must be 2. But then a = 2a₁,
b = 2b₁, c = 2c₁, and

\[
b_1^2 - 4a_1c_1 = \tfrac14 d = N,
\]

which contradicts the fact that N ≡ 2 or 3 (mod 4). Second, if
D ≠ d, we easily see that D = dm² for some m > 1, and then there
is either the imprimitive form with coefficients

\[
m, \quad m, \quad -\tfrac14 m(d - 1)
\]

or the imprimitive form with coefficients

\[
m, \quad 0, \quad -\tfrac14 md,
\]

of discriminant D. This proves the assertion.

In the theory of quadratic fields, the value of (d|p) determines 
the way in which a prime p factorizes in the quadratic field of dis- 
criminant d; it remains a prime if (d|p) = —1, and factorizes into 
two prime ideals if (d|p) = 1. Similarly, in the theory of quadratic
forms, p is not representable by any form of (fundamental) dis- 
criminant d if (d|p) = —1, but is representable by at least one form 
if (d|p) = 1. 

In connection with primitive real characters, it may be noted that
χ(−1) has the value +1 or −1 according as d is positive or negative.
It is sufficient to prove this for the “prime discriminants” listed in
(9), as the general character is a product of basic characters, and
both the value of χ(−1) and the sign of d are multiplicative. For
d = −4 the character is χ₄(n) and χ₄(−1) = −1. For d = 8 the
character is χ₈(n), and χ₈(−1) = 1. For d = −8 the character is
χ₄(n)χ₈(n), and χ₄(−1)χ₈(−1) = −1. For d = (−1)^{½(p−1)}p the
character is (n|p), and for n = −1 it is +1 or −1 according as
p ≡ 1 or −1 (mod 4), that is, according as d is positive or negative.
Hence the result.

Thus a real primitive character is associated with a real quadratic
field or with an imaginary quadratic field, according to the value
of χ(−1).
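The statement about χ(−1) can be tested numerically. The following is a sketch under stated assumptions rather than anything taken from the text: `jacobi` is the usual binary algorithm for the Jacobi symbol, `kronecker` extends it to even n by the rule (d|2) = ±1 according as d ≡ ±1 or ±3 (mod 8) for odd d, and χ(−1) is read off as χ(|d| − 1) using the periodicity of χ to the modulus |d|. All function names are ours.

```python
def jacobi(a, n):
    """Jacobi symbol (a|n) for odd n > 0."""
    a %= n
    result = 1
    while a:
        while a % 2 == 0:
            a //= 2
            if n % 8 in (3, 5):       # (2|n) = -1 when n = 3 or 5 (mod 8)
                result = -result
        a, n = n, a                   # reciprocity for odd a, n
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0


def kronecker(d, n):
    """Kronecker symbol (d|n) for n > 0 and d = 0 or 1 (mod 4)."""
    if n <= 0:
        raise ValueError("n must be positive")
    result = 1
    while n % 2 == 0:
        n //= 2
        if d % 2 == 0:
            return 0
        if d % 8 in (3, 5):           # (d|2) = -1 when d = 3 or 5 (mod 8)
            result = -result
    return result * jacobi(d, n)


def chi_at_minus_one(d):
    """chi(-1) for chi(n) = (d|n), using -1 = |d| - 1 (mod |d|)."""
    return kronecker(d, abs(d) - 1)
```

For example, `chi_at_minus_one(-7)` evaluates (−7|6) and `chi_at_minus_one(12)` evaluates (12|11); the signs agree with those of d.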

Finally, we observe that the L function of any real primitive
character can now be expressed as

\[
L(s, \chi) = \sum_{n=1}^{\infty} \left(\frac{d}{n}\right) n^{-s}
\]

for σ > 1.


6 


DIRICHLET’S CLASS NUMBER 
FORMULA 


Dirichlet’s class number formula, in its simplest and most striking
form, was conjectured by Jacobi¹ in 1832 and (as we said in §1)
proved in full by Dirichlet in 1839.

There are two stages in Dirichlet’s work. In the first stage, the
class number of quadratic forms of given (fundamental) discriminant
d is related to the value of L(1, χ), where χ is the real
primitive character (d|n). This relation renders visible the fact that
L(1, χ) > 0. In the second stage, the value of L(1, χ) is expressed
in terms of a finite sum by an argument which is essentially the same
as that used in §1.

In this section we shall give the substance of Dirichlet’s work,
but to avoid excessive length we shall quote a number of results
concerning quadratic forms from Landau’s Vorlesungen I. We
cannot follow Dirichlet in detail, because he used the notation

\[
ax^2 + 2bxy + cy^2
\]

for a quadratic form, whereas (following Lagrange and most
modern writers) we shall use the notation

\[
ax^2 + bxy + cy^2.
\]

The forms of given (fundamental) discriminant d fall into classes
of mutually equivalent forms under linear substitutions of the
type

\[
x = \alpha x' + \beta y', \qquad y = \gamma x' + \delta y', \tag{1}
\]

with integral coefficients α, β, γ, δ satisfying αδ − βγ = 1. We call
these unimodular substitutions. As Lagrange showed, every class
contains at least one form whose coefficients satisfy the inequalities

\[
|b| \le |a| \le |c|,
\]

and it follows easily that the number of classes, for a given
discriminant d, is finite.²

¹ See p. 51 below, and Bachmann, Kreisteilung, Vorlesung 20, or H. J. S. Smith,
Report on the Theory of Numbers, §121.

If d is negative, the forms of discriminant d are definite. Half 
of them are positive definite and half are negative definite, the 
latter being obtained from the former by replacing a,b,c by —a, 
—b, —c. It is obviously sufficient to consider the positive definite 
forms, which is equivalent to saying that we restrict ourselves to 
forms with a > 0. If d is positive, each of the forms of discriminant 
d is indefinite. It is therefore equivalent to some form with a > 0, 
for we can choose some positive number represented properly by 
the form (that is, with x and y relatively prime), and any such 
number occurs as the first coefficient of some equivalent form. 
We can select a representative from each class of equivalent forms 
with a > 0, and it is convenient to do so. We denote the number of 
classes of forms (positive definite if d < 0) by h(d). 
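For d < 0 the number h(d) can be found by direct enumeration. The sketch below assumes the standard sharpening of Lagrange's inequalities, which is not stated in the text: each class of positive definite forms contains exactly one form with −a < b ≤ a ≤ c, and with b ≥ 0 when a = c. Since |d| = 4ac − b² ≥ 3a² for such a form, only finitely many a need be tried. The function name is ours.

```python
from math import gcd


def class_number_negative(d):
    """Count reduced primitive positive definite forms (a, b, c) of discriminant d < 0."""
    if d >= 0 or d % 4 not in (0, 1):
        raise ValueError("d must be a negative discriminant")
    count = 0
    a = 1
    while 3 * a * a <= -d:                 # reduced forms satisfy 3a^2 <= |d|
        for b in range(-a + 1, a + 1):     # -a < b <= a
            if (b * b - d) % (4 * a):
                continue                   # c = (b^2 - d)/(4a) must be an integer
            c = (b * b - d) // (4 * a)
            if c < a:
                continue
            if c == a and b < 0:
                continue                   # (a, b, a) and (a, -b, a) are equivalent
            if gcd(gcd(a, abs(b)), c) != 1:
                continue                   # automatic when d is fundamental
            count += 1
        a += 1
    return count
```

For d = −23 the three forms counted are (1, 1, 6), (2, 1, 3) and (2, −1, 3).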

There is always at least one form of discriminant d, namely, the 
principal form 
\[
\begin{cases} x^2 - \tfrac14 d y^2 & \text{if } d \equiv 0 \pmod 4, \\ x^2 + xy - \tfrac14 (d - 1) y^2 & \text{if } d \equiv 1 \pmod 4. \end{cases} \tag{2}
\]


Hence h(d) is a positive integer. 

In the relationship between h(d) and L(1, x), the proof of which 
represents the first stage of Dirichlet’s work, there intervenes a 
factor depending on the automorphs of the forms of discriminant d, 
that is, the unimodular substitutions that transform a form into 
itself. There are always two trivial automorphs, namely, the identity
x = x′, y = y′ and the negative identity x = −x′, y = −y′. If
d < 0, there are in general no others, but there are two exceptions
to this: when d = −3 or −4. In both these cases there is only one
class of forms, represented by the principal form. If d = −3, the
principal form is x² + xy + y², and this has the additional
automorphs

\[
x = -y', \quad y = x' + y', \qquad \text{and} \qquad x = x' + y', \quad y = -x',
\]

and their negatives. If d = −4, the principal form is x² + y², and
this has the additional automorph

\[
x = y', \quad y = -x',
\]

and its negative. We denote by w the number of automorphs, so
that

\[
w = \begin{cases} 2 & \text{if } d < -4, \\ 4 & \text{if } d = -4, \\ 6 & \text{if } d = -3. \end{cases} \tag{3}
\]

² Landau, Vorlesungen, Satz 197.


(Another interpretation for w is that it is the number of roots of 
unity in the quadratic field of discriminant d.) 

The position is quite different when d > 0. Each form has in- 
finitely many automorphs, and these are determined by the solu- 
tions of Pell’s equation 


\[
t^2 - du^2 = 4. \tag{4}
\]

For the form with coefficients a, b, c, the automorphs are given by³

\[
\alpha = \tfrac12(t - bu), \quad \beta = -cu, \quad \gamma = au, \quad \delta = \tfrac12(t + bu). \tag{5}
\]

The trivial automorphs correspond to the trivial solutions t = ±2,
u = 0 of Pell’s equation. The equation (4) has infinitely many
solutions, and if t₀, u₀ is that solution with t₀ > 0, u₀ > 0 for which u₀
is least, then all solutions are given by⁴

\[
\tfrac12(t + u\sqrt{d}) = \pm\left[\tfrac12(t_0 + u_0\sqrt{d})\right]^n, \tag{6}
\]

where n is an integer (positive or negative). That (5) actually does
give an automorph is easily verified by factorizing the form ax² +
bxy + cy². We have


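\[
ax^2 + bxy + cy^2 = a(x - \theta y)(x - \theta' y), \tag{7}
\]

where

\[
\theta = \frac{-b + \sqrt{d}}{2a}, \qquad \theta' = \frac{-b - \sqrt{d}}{2a}, \tag{8}
\]

and the effect of the unimodular substitution with the coefficients
(5) is expressed by

\[
x - \theta y = \tfrac12(t - u\sqrt{d})(x' - \theta y'), \qquad
x - \theta' y = \tfrac12(t + u\sqrt{d})(x' - \theta' y'); \tag{9}
\]

the product of the constant factors is 1 by (4).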
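The verification can also be carried out by direct substitution for particular forms. The sketch below is illustrative only (the function names and the brute-force search are ours): it finds the least solution t₀, u₀ of (4) by trying u = 1, 2, ..., builds the coefficients (5), and applies the substitution (1) to the form.

```python
from math import isqrt


def least_pell_solution(d):
    """Least t, u > 0 with t^2 - d u^2 = 4, for a positive non-square discriminant d."""
    u = 1
    while True:
        t = isqrt(d * u * u + 4)
        if t * t == d * u * u + 4:
            return t, u
        u += 1


def automorph(a, b, c, t, u):
    """The coefficients (alpha, beta, gamma, delta) of formula (5)."""
    # t and b*u have the same parity because t^2 = d u^2 = b^2 u^2 (mod 4)
    return (t - b * u) // 2, -c * u, a * u, (t + b * u) // 2


def transform(form, substitution):
    """Apply x = alpha x' + beta y', y = gamma x' + delta y' to the form (a, b, c)."""
    a, b, c = form
    al, be, ga, de = substitution
    return (a * al * al + b * al * ga + c * ga * ga,
            2 * a * al * be + b * (al * de + be * ga) + 2 * c * ga * de,
            a * be * be + b * be * de + c * de * de)
```

For the form x² + xy − y² of discriminant 5 the least solution is t₀ = 3, u₀ = 1, the substitution (5) has coefficients 1, 1, 1, 2, and `transform` returns the form unchanged.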


³ Landau, Vorlesungen, Satz 202.
⁴ Landau, Vorlesungen, Satz 111.

We now turn to the question of the total number of representations
of a positive integer n by a representative set of forms of given
(fundamental) discriminant d. This question was answered (implicitly,
at least) in the classical theory of quadratic forms, developed
by Lagrange and further by Gauss.

If d < 0, so that the forms are positive definite, the number of
representations of n by any form is finite. We denote by R(n) the
total number of representations by the various forms of a
representative set. But if d > 0 there are infinitely many
representations, since any one representation gives rise to an
infinity of others by the application of the automorphs of the form.
We shall select one representation from each such set, and call it
primary, and it will transpire that the number of primary
representations is finite. If x, y and X, Y are two representations of
the same integer that are related by an automorph, then by (9) we have

\[
\frac{x - \theta y}{x - \theta' y} = \frac{\tfrac12(t - u\sqrt{d})}{\tfrac12(t + u\sqrt{d})} \cdot \frac{X - \theta Y}{X - \theta' Y}.
\]

Let ε = ½(t₀ + u₀√d) > 1. Then, by (6),

\[
\tfrac12(t + u\sqrt{d}) = \pm\varepsilon^{m}, \qquad \tfrac12(t - u\sqrt{d}) = \pm\varepsilon^{-m},
\]

for some integer m. There is just one choice of m (for given X and Y)
which will ensure that

\[
1 \le \frac{x - \theta' y}{x - \theta y} < \varepsilon^2, \tag{10}
\]

and then by choice of the ambiguous sign we can further ensure
that

\[
x - \theta y > 0. \tag{11}
\]

A representation that satisfies these two conditions will be called
primary. The number of primary representations of a given integer
n by a given form is finite, since the product of the linear forms
x − θy and x − θ′y is n/a by (7), and their quotient is bounded
both ways by (10). For d > 0 we denote by R(n) the total number of
primary representations of n by a representative set of forms of
discriminant d.
The basic result of the theory of quadratic forms is as follows.⁵

If n > 0 and (n, d) = 1 then

\[
R(n) = w \sum_{m \mid n} \left(\frac{d}{m}\right), \tag{12}
\]

where w is given by (3) if d < 0 and w = 1 if d > 0.

This is proved by expressing R(n) in terms of the number of solutions
of the congruence z² ≡ d (mod 4n), and then evaluating this
number in terms of quadratic character symbols.

⁵ Landau, Vorlesungen, Satz 204.
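Formula (12) is easy to test by brute force in the two cases d = −4 and d = −3, where there is a single class, represented by the principal form, and w = 4 or 6. In the sketch below (all names are ours) the characters are written out by residue class: (−4|m) depends only on m mod 4, and (−3|m) equals (m|3) by relation (8) with p = 3.

```python
from math import isqrt


def count_representations(n, a, b, c):
    """Number of integer pairs (x, y) with a x^2 + b x y + c y^2 = n.

    Assumes a positive definite form; the search box is deliberately generous.
    """
    bound = isqrt(4 * max(a, c) * n) + 1
    return sum(1
               for x in range(-bound, bound + 1)
               for y in range(-bound, bound + 1)
               if a * x * x + b * x * y + c * y * y == n)


def divisor_sum(n, chi):
    """Sum of chi(m) over the positive divisors m of n."""
    return sum(chi(m) for m in range(1, n + 1) if n % m == 0)


def chi_minus4(m):
    """The character (-4|m): 1, 0, -1, 0 for m = 1, 2, 3, 0 (mod 4)."""
    return (0, 1, 0, -1)[m % 4]


def chi_minus3(m):
    """The character (-3|m) = (m|3): 1, -1, 0 for m = 1, 2, 0 (mod 3)."""
    return (0, 1, -1)[m % 3]
```

For instance n = 5 has the 8 representations (±1, ±2), (±2, ±1) by x² + y², in agreement with 4[(−4|1) + (−4|5)] = 8.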

The basic idea in the first stage of Dirichlet’s work is to deter- 
mine, from the above expression for R(n), the average value of 
R(n) as n varies. It is convenient (and it suffices for the purpose 
in view) to limit oneself to values of n that are relatively prime to d. 
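We have

\[
w^{-1} \sum_{\substack{n=1 \\ (n,d)=1}}^{N} R(n)
  = \sum_{\substack{m_1m_2 \le N \\ (m_1m_2,d)=1}} \left(\frac{d}{m_1}\right)
  = \sum_{m_1 \le \sqrt{N}} \left(\frac{d}{m_1}\right) \sum_{\substack{m_2 \le N/m_1 \\ (m_2,d)=1}} 1
  \;+ \sum_{\substack{m_2 < \sqrt{N} \\ (m_2,d)=1}} \ \sum_{\sqrt{N} < m_1 \le N/m_2} \left(\frac{d}{m_1}\right),
\]

since the first sum comprises all pairs m₁, m₂ for which m₁ ≤ √N
and the second sum all pairs for which m₁ > √N. The first inner
sum is

\[
\frac{N}{m_1} \frac{\varphi(|d|)}{|d|} + O(\varphi(|d|)),
\]

so the first double sum is

\[
N \frac{\varphi(|d|)}{|d|} \sum_{m_1 \le \sqrt{N}} \frac{1}{m_1}\left(\frac{d}{m_1}\right) + O(\sqrt{N}),
\]

for fixed d and arbitrarily large N. Since (d|m₁) is a nonprincipal
character to the modulus |d|, the sum of its values as m₁ varies over
any range is bounded. Hence the second double sum is O(√N). Thus

\[
w^{-1} \sum_{\substack{n=1 \\ (n,d)=1}}^{N} R(n) = N \frac{\varphi(|d|)}{|d|} \sum_{m \le \sqrt{N}} \frac{1}{m}\left(\frac{d}{m}\right) + O(\sqrt{N}).
\]

We can extend the sum over m to infinity, and the remainder is
estimated by

\[
\sum_{m > \sqrt{N}} \frac{1}{m}\left(\frac{d}{m}\right) = O(N^{-1/2}),
\]

on using partial summation. This again contributes an error
O(√N) in the above asymptotic expression. In particular, we
conclude that

\[
\lim_{N \to \infty} \frac{1}{N} \sum_{\substack{n=1 \\ (n,d)=1}}^{N} R(n) = w \frac{\varphi(|d|)}{|d|} L(1, \chi). \tag{13}
\]

Since φ(|d|)/|d| measures the density of the integers n for which
(n, d) = 1, we can express the result in the form: The average with
respect to n of R(n) is wL(1, χ), where χ(m) = (d|m).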

The next step is to evaluate the average of R(n) from its original 
definition. Let R(n,f) denote the number of representations of n 
(primary if d > 0) by a particular form f of discriminant d. Then 


\[
R(n) = \sum_{f} R(n, f), \tag{14}
\]

where the summation is over a representative set of forms (with
a > 0), so that the number of terms in the sum is h(d). We shall now
evaluate

\[
\lim_{N \to \infty} \frac{1}{N} \sum_{\substack{n=1 \\ (n,d)=1}}^{N} R(n, f),
\]

and it will turn out to be independent of f. Comparison of the two
limits will give the relation between h(d) and L(1, χ).
Take first the case d < 0. Then

\[
\sum_{\substack{n=1 \\ (n,d)=1}}^{N} R(n, f)
\]

is the number of pairs of integers x, y satisfying

\[
0 < ax^2 + bxy + cy^2 \le N, \qquad (ax^2 + bxy + cy^2, d) = 1.
\]

The second condition limits x, y to certain pairs of residue classes to
the modulus |d|, and it is easily proved⁶ that the number of these pairs
is |d|φ(|d|). Hence it suffices to consider the number of pairs of
integers x, y satisfying

\[
ax^2 + bxy + cy^2 \le N, \qquad x \equiv x_0, \quad y \equiv y_0 \pmod{|d|}.
\]

The first inequality expresses that the point (x, y) is in an ellipse with
center at the origin, and as N → ∞ this ellipse expands uniformly.
The area of the ellipse is

\[
\frac{2\pi}{\sqrt{4ac - b^2}}\, N = \frac{2\pi}{|d|^{1/2}}\, N.
\]

Intuition suggests (and a rigorous proof is easily given by dividing
the plane into squares of side |d|) that the number of points is
asymptotic to

\[
\frac{1}{|d|^2} \frac{2\pi}{|d|^{1/2}}\, N
\]

as N → ∞. We have to multiply this by |d|φ(|d|) to allow for the
various possibilities for x₀, y₀. Thus the conclusion is that

\[
\lim_{N \to \infty} \frac{1}{N} \sum_{\substack{n=1 \\ (n,d)=1}}^{N} R(n, f) = \frac{\varphi(|d|)}{|d|} \frac{2\pi}{|d|^{1/2}}.
\]

Comparison with (13) and (14) gives

\[
h(d) = \frac{w|d|^{1/2}}{2\pi} L(1, \chi) \quad \text{for } d < 0. \tag{15}
\]

⁶ Landau, Vorlesungen, Satz 206.

Now take the case d > 0. Arguing as before, we need the number
of integer points (x, y) satisfying

\[
ax^2 + bxy + cy^2 \le N, \qquad x - \theta y > 0, \qquad 1 \le \frac{x - \theta' y}{x - \theta y} < \varepsilon^2,
\]

and

\[
x \equiv x_0, \quad y \equiv y_0 \pmod d.
\]

The first set of conditions represents a sector of a hyperbola bounded
by two fixed lines (or rather half-lines) through the origin. The area
of this sector is easily calculated by changing the coordinates from
x, y to ξ, η, where

\[
\xi = x - \theta y, \qquad \eta = x - \theta' y.
\]

We have

\[
\frac{\partial(\xi, \eta)}{\partial(x, y)} = \theta - \theta' = \frac{d^{1/2}}{a}.
\]

In the ξ, η plane, the sector is given by

\[
\xi\eta \le N/a, \qquad \xi > 0, \qquad \xi \le \eta < \varepsilon^2 \xi.
\]

These conditions are equivalent to

\[
0 < \xi \le (N/a)^{1/2}, \qquad \xi \le \eta < \min\left(\varepsilon^2 \xi, \frac{N}{a\xi}\right).
\]

Hence the area is

\[
\int_0^{\xi_1} (\varepsilon^2 \xi - \xi)\, d\xi + \int_{\xi_1}^{(N/a)^{1/2}} \left(\frac{N}{a\xi} - \xi\right) d\xi,
\]

where ξ₁ = ε^{−1}(N/a)^{1/2}. This is

\[
\tfrac12(\varepsilon^2 - 1)\xi_1^2 + \tfrac12 (N/a)\log(N/a) - (N/a)\log \xi_1 - \tfrac12 (N/a) + \tfrac12 \xi_1^2,
\]

which reduces to

\[
(N/a) \log \varepsilon.
\]

This has to be divided by d^{1/2}a^{−1} to give the area in the x, y plane.
We have then to divide this by d² to allow for the congruences to the
modulus d, and to multiply by dφ(d) to allow for the choices of
x₀, y₀. This gives

\[
\lim_{N \to \infty} \frac{1}{N} \sum_{\substack{n=1 \\ (n,d)=1}}^{N} R(n, f) = \frac{\varphi(d)}{d} \frac{\log \varepsilon}{d^{1/2}},
\]

and comparison with (13) and (14) gives

\[
h(d) = \frac{d^{1/2}}{\log \varepsilon} L(1, \chi) \quad \text{for } d > 0. \tag{16}
\]

This completes the first stage of the work, and, as we said earlier,
the results (15) and (16) render visible the fact that L(1, χ) > 0.

There remains the question of expressing L(1, χ) by means of a
finite sum, as was done in §1 in the particular case when |d| is a prime.
The work is on the same general lines as there, but one needs the
evaluation of a slight extension of Gauss’ sum. This takes the form⁷

\[
\sum_{m=1}^{|d|} \left(\frac{d}{m}\right) e(mn/|d|) = \left(\frac{d}{n}\right) \epsilon\, |d|^{1/2},
\]

where ϵ = 1 if d > 0 and ϵ = i if d < 0. I will merely quote the final
results⁸:

\[
L(1, \chi) = -\frac{\pi}{|d|^{3/2}} \sum_{m=1}^{|d|} m \left(\frac{d}{m}\right) \quad \text{if } d < 0, \tag{17}
\]

\[
L(1, \chi) = -\frac{1}{d^{1/2}} \sum_{m=1}^{d-1} \left(\frac{d}{m}\right) \log \sin \frac{m\pi}{d} \quad \text{if } d > 0. \tag{18}
\]

Here (17) is the more general form of (7) of §1 [there is similarly a
more general form of (8)], and (18) is the more general form of (9)
of §1.

⁷ Landau, Vorlesungen, Satz 215.
⁸ Landau, Vorlesungen, Satz 217.
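Combining (16) with (18) gives h(d) log ε = −Σ (d|m) log sin(mπ/d) for d > 0, and this can be evaluated in floating point. The sketch below is an illustration under stated assumptions: the Kronecker symbol is computed by the usual binary Jacobi algorithm together with the rule (d|2) = ±1 according as d ≡ ±1 or ±3 (mod 8), ε is obtained from the least solution of t² − du² = 4 by trial of u = 1, 2, ..., and the result is only a numerical approximation to an integer. All function names are ours.

```python
from math import isqrt, log, pi, sin, sqrt


def jacobi(a, n):
    """Jacobi symbol (a|n) for odd n > 0."""
    a %= n
    result = 1
    while a:
        while a % 2 == 0:
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        a, n = n, a
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0


def kronecker(d, n):
    """Kronecker symbol (d|n) for n > 0 and d = 0 or 1 (mod 4)."""
    result = 1
    while n % 2 == 0:
        n //= 2
        if d % 2 == 0:
            return 0
        if d % 8 in (3, 5):
            result = -result
    return result * jacobi(d, n)


def least_pell_solution(d):
    """Least t, u > 0 with t^2 - d u^2 = 4."""
    u = 1
    while True:
        t = isqrt(d * u * u + 4)
        if t * t == d * u * u + 4:
            return t, u
        u += 1


def class_number_positive(d):
    """Approximate h(d) for a fundamental discriminant d > 0, from (16) and (18)."""
    t, u = least_pell_solution(d)
    eps = (t + u * sqrt(d)) / 2
    total = sum(kronecker(d, m) * log(sin(m * pi / d)) for m in range(1, d))
    return -total / log(eps)
```

For d = 12 the value is 2 up to rounding: the forms x² − 3y² and −x² + 3y² are not equivalent, since ε = 2 + √3 and there is no unit of norm −1.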

The case when d = −q, where q is a prime congruent to 3 (mod 4),
is particularly simple and interesting. We suppose that q > 3 so as
to avoid any complication with the value of w. We have w = 2,
and on combining (15) with (17) we get

\[
h(d) = -\frac{1}{|d|} \sum_{m=1}^{|d|} m \left(\frac{d}{m}\right) = -\frac{1}{q} \sum_{m=1}^{q-1} m \left(\frac{m}{q}\right). \tag{19}
\]

It was this particular case of the class-number formula that was
conjectured originally by Jacobi, and the considerations that led
him to make the conjecture are curious. The number on the right
of (19) is certainly an integer, say H, since by Euler’s criterion

\[
\sum_{m=1}^{q-1} m \left(\frac{m}{q}\right) \equiv \sum_{m=1}^{q-1} m^{\frac12(q+1)} \equiv 0 \pmod q.
\]

Jacobi proved, by an ingenious argument involving products and
quotients of Gaussian sums, that H has the following property:
for every prime p ≡ 1 (mod q), there is a representation of p^{|H|} in
the form

\[
4p^{|H|} = x^2 + qy^2.
\]

(The reader will not be surprised to learn that Jacobi was unable
to prove that H is positive.) On the other hand, it can be deduced
from the theory of quadratic forms that the same property is
possessed by the class number h(−q). This led Jacobi to look for a
connection between them, and after examining a number of particular
cases he formulated the conjecture that h(−q) = |H|.
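The number H on the right of (19) is simple to compute. In the sketch below (function names ours) the Legendre symbol is evaluated by Euler’s criterion, and the divisibility of the sum by q is asserted rather than assumed; q is taken to be a prime with q ≡ 3 (mod 4) and q > 3.

```python
def legendre(m, q):
    """Legendre symbol (m|q) for an odd prime q, by Euler's criterion."""
    r = pow(m, (q - 1) // 2, q)      # r is 0, 1 or q - 1
    return -1 if r == q - 1 else r


def jacobi_H(q):
    """The integer H = -(1/q) * sum of m (m|q) over 1 <= m <= q - 1."""
    total = sum(m * legendre(m, q) for m in range(1, q))
    assert total % q == 0            # the congruence noted after (19)
    return -total // q
```

For q = 23 the quadratic residues sum to 92 and the non-residues to 161, so H = 69/23 = 3, which is the class number h(−23).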


We conclude this section by stating briefly the connection between
the theory of classes of equivalent quadratic forms and the theory of
ideals in quadratic fields, but we shall omit the proofs.⁹

Let K be a quadratic field of discriminant d and let 𝔞 be an integral
ideal in K. The general integer ξ of 𝔞 is given by

\[
\xi = \alpha x + \beta y,
\]

where α, β is a basis of 𝔞 and x, y run through all the rational integers.
Thus

\[
N\xi = (\alpha x + \beta y)(\alpha' x + \beta' y),
\]

and this is a quadratic form in x and y with rational integral
coefficients. All three coefficients are divisible by N𝔞, and if we write

\[
\frac{N\xi}{N\mathfrak{a}} = ax^2 + bxy + cy^2,
\]

the discriminant of this form is d. The class to which the form belongs
is independent of the choice of basis for the ideal, and is also the same
for two equivalent ideals, provided that equivalence of ideals is
defined in the narrow sense. That means that two ideals 𝔞, 𝔟 in K
are said to be equivalent if there is a number λ of K with Nλ > 0
such that

\[
\mathfrak{a} = (\lambda)\mathfrak{b}
\]

(of course λ need not be integral). Further, there is a one-to-one
correspondence between a representative set of forms of discriminant
d (positive definite if d < 0) and a representative set of ideals relative
to equivalence in the narrow sense.

⁹ See Landau, Vorlesungen III, pp. 186-198 or Hecke, Algebraische Zahlen, §53.

If d < 0 there is no distinction between equivalence in the narrow
sense and in the ordinary sense, for then Nλ is necessarily positive.
If d > 0 and there is a unit in K of norm −1, there is also no
distinction, for we can ensure that Nλ > 0 by multiplying λ by such a
unit if necessary. If d > 0 but there is no unit of norm −1, each ideal
class in the ordinary sense comprises two ideal classes in the narrow
sense. It follows that, if we denote by h₁(d) the number of ideal classes
in K in the ordinary sense, then:

\[
h(d) = h_1(d)
\]

if d < 0 or if d > 0 and there is a unit in K of norm −1; but

\[
h(d) = 2h_1(d)
\]

if d > 0 and there is no unit of norm −1.

There is a similar one-to-one correspondence between the
automorphs of the fields, when d > 0, and the units in K of norm +1.
If ε₁ denotes the fundamental unit of K, then ε = ε₁ if Nε₁ = +1, but
ε = ε₁² if Nε₁ = −1.

Combining these results, we have in both cases

\[
h(d) \log \varepsilon = 2h_1(d) \log \varepsilon_1 \quad \text{for } d > 0.
\]

Thus the final expressions for the class number h₁(d) of a quadratic
field become

\[
h_1(d) = -\frac{w}{2|d|} \sum_{m=1}^{|d|} m \left(\frac{d}{m}\right) \quad \text{if } d < 0,
\]

\[
h_1(d) \log \varepsilon_1 = -\frac12 \sum_{m=1}^{d-1} \left(\frac{d}{m}\right) \log \sin \frac{m\pi}{d} \quad \text{if } d > 0.
\]

Dirichlet’s class-number formula, as given in (15) and (16), can
be regarded as a special case of a theorem¹⁰ that applies to any
algebraic number field K, by which the product of the class number
and the regulator is expressed in terms of the residue at s = 1 of the
Dedekind ζ function of K. If K is a quadratic field, the Dedekind ζ
function is simply ζ(s)L(s, χ), and the residue is L(1, χ). But this
special case is of interest in its own right, particularly in view of the
fact that L(1, χ) can be expressed by a finite sum, as in (17) and (18).

¹⁰ Hecke, Algebraische Zahlen, Satz 125.


THE DISTRIBUTION OF
THE PRIMES


Legendre was the first, as far as we know, to make any significant
conjecture about the distribution of the primes. Let π(x) denote the
number of primes not exceeding x. Then Legendre conjectured,
somewhat tentatively, that for large x the number π(x) is given
approximately by

\[
\frac{x}{\log x - 1.08\ldots}.
\]

This would presumably imply, at the very least, that the ratio of
π(x) to x/log x tends to 1 as x → ∞; and this is the celebrated Prime
Number Theorem, which was first proved by Hadamard and de la
Vallée Poussin independently in 1896. If we construe the conjecture
in the more precise form that

\[
\pi(x) = \frac{x}{\log x - A(x)},
\]

where A(x) → 1.08... as x → ∞, then it is erroneous, since (as we shall
see) the limit of A(x) is 1.

Gauss, in a letter of 1849 (which, however, was not published until
much later), related that as a boy he had thought much on this
question, and had reached the conclusion that a good approximation
to π(x) was given by

\[
\operatorname{li} x = \int_2^x \frac{dt}{\log t}.
\]

He certainly believed that the ratio π(x)/li x has the limit 1, which
again is equivalent to the prime number theorem; how much more
he believed is uncertain.¹ The asymptotic expansion of li x, found by
integrating by parts several times, is

\[
\operatorname{li} x = \frac{x}{\log x} + \frac{1!\,x}{(\log x)^2} + \cdots + \frac{q!\,x}{(\log x)^{q+1}} [1 + \varepsilon(x)]
\]

for any fixed q, where ε(x) → 0 as x → ∞. If the second term of this
is significant in the approximation to π(x) by li x, as we now know
that it is, the limit of Legendre’s A(x) is 1.

¹ See Landau, Handbuch, Kap. 1.

The first mathematician of all time to prove any worthwhile results
about the behavior of π(x) as x → ∞ was Tchebychev, in 1851 and
1852. In his first paper he provided some measure of justification for
Gauss’ conjectural association of π(x) with li x. He proved that

\[
\liminf_{x \to \infty} \frac{\pi(x)}{\operatorname{li} x} \le 1 \le \limsup_{x \to \infty} \frac{\pi(x)}{\operatorname{li} x},
\]

so that if the limit exists it must be 1. But he further proved (in effect)
that if there is a function with an asymptotic expansion of the same
general character as li x which gives a good approximation to π(x),
then this function can only be li x itself. The proof is based on the
asymptotic behavior of various combinations of ζ(s), ζ′(s), ζ″(s), ... as
s → 1 from the right.²

In his second paper Tchebychev gave definite inequalities for
π(x): he proved that

\[
(0.92\ldots) \frac{x}{\log x} < \pi(x) < (1.105\ldots) \frac{x}{\log x} \tag{1}
\]

for all sufficiently large x.
The proof depends on an interesting identity satisfied by the
arithmetical function Λ(n), which is defined by

\[
\Lambda(n) = \begin{cases} \log p & \text{if } n \text{ is a power of a prime } p, \\ 0 & \text{otherwise.} \end{cases} \tag{2}
\]

The identity states that

\[
\sum_{m \mid n} \Lambda(m) = \log n. \tag{3}
\]

Although this can be proved directly, the simplest way of deriving
this and similar identities is by comparing coefficients in two
Dirichlet series which have the same sum. By logarithmic
differentiation of Euler’s identity,

\[
-\frac{\zeta'(s)}{\zeta(s)} = \sum_{p} \sum_{m=1}^{\infty} (\log p)\, p^{-ms} = \sum_{n=1}^{\infty} \Lambda(n) n^{-s}. \tag{4}
\]

² See Landau, Handbuch, Kap. 10.

Multiplying this by ζ(s), we get

\[
\left(\sum_{n=1}^{\infty} \Lambda(n) n^{-s}\right) \left(\sum_{n=1}^{\infty} n^{-s}\right) = -\zeta'(s) = \sum_{n=1}^{\infty} (\log n)\, n^{-s},
\]

for s > 1, and comparison of coefficients gives (3).
If we sum (3) over positive integers n ≤ x, we obtain

\[
T(x) = \sum_{m \le x} \Lambda(m) \left[\frac{x}{m}\right] = \sum_{n \le x} \log n = \log\, [x]!,
\]

and the number on the right is x log x − x + O(log x) by Stirling’s
formula. This was the basis for Tchebychev’s proof of (1).
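Both the identity (3) and the resulting formula for T(x) lend themselves to a quick numerical check. The sketch below is illustrative (the function names are ours): Λ(n) is computed by trial division, and log [x]! is obtained from the standard library function `lgamma`, so the comparisons hold only up to floating-point error.

```python
from math import lgamma, log


def von_mangoldt(n):
    """Lambda(n): log p if n is a power p^k (k >= 1) of a prime p, else 0."""
    if n < 2:
        return 0.0
    p = 2
    while p * p <= n:
        if n % p == 0:
            while n % p == 0:
                n //= p
            return log(p) if n == 1 else 0.0
        p += 1
    return log(n)                    # n itself is prime


def T(x):
    """T(x) = sum over m <= x of Lambda(m) [x/m]."""
    x = int(x)
    return sum(von_mangoldt(m) * (x // m) for m in range(1, x + 1))


def log_factorial(x):
    """log [x]! computed as lgamma([x] + 1)."""
    return lgamma(int(x) + 1)
```

Thus `T(100)` and `log_factorial(100)` should agree to within rounding error, both being log 100!.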

A result of the same general character as (1), but with less precise
constants, can be proved by considering the combination

\[
T(x) - 2T(\tfrac12 x) = \sum_{m \le x} \Lambda(m) \left(\left[\frac{x}{m}\right] - 2\left[\frac{x}{2m}\right]\right).
\]

The left side is asymptotic to x log 2, and the right side is

\[
\le \sum_{m \le x} \Lambda(m) = \sum_{p \le x} (\log p) \left[\frac{\log x}{\log p}\right] \le (\log x)\, \pi(x).
\]

This yields a lower bound for π(x) of the desired character. The right
side above is also

\[
\ge \sum_{\frac12 x < m \le x} \Lambda(m) \ge \sum_{\frac12 x < p \le x} \log p \ge (\log \tfrac12 x) [\pi(x) - \pi(\tfrac12 x)].
\]

This gives an upper bound for π(x) − π(½x), from which an upper
bound for π(x) is easily derived by an inductive argument.
Tchebychev’s proof of (1) was based on the consideration of the more
elaborate combination³

\[
T(x) - T(\tfrac12 x) - T(\tfrac13 x) - T(\tfrac15 x) + T(\tfrac{1}{30} x).
\]


The next substantial progress was made by Mertens in 1874. He
proved that

\[
\sum_{p \le x} \frac{1}{p} = \log\log x + A + O[(\log x)^{-1}], \tag{5}
\]

a result which (even in a less precise form) had been attempted by
Tchebychev without success. The proof, as one now sees it, is not
particularly difficult. We have seen that

\[
\sum_{m \le x} \Lambda(m) \left[\frac{x}{m}\right] = T(x) = x \log x + O(x).
\]

³ Landau, Handbuch, Kap. 5.

The contribution of the prime values of m is

\[
\sum_{p \le x} (\log p) \left[\frac{x}{p}\right] = x \sum_{p \le x} \frac{\log p}{p} + O[(\log x)\pi(x)]
  = x \sum_{p \le x} \frac{\log p}{p} + O(x).
\]

Other values of m contribute O(x). Hence

\[
\sum_{p \le x} \frac{\log p}{p} = \log x + O(1).
\]

Denoting the sum on the left by s(x), we have

\[
\sum_{p \le x} \frac{1}{p} = \sum_{2 \le n \le x} \frac{s(n) - s(n-1)}{\log n},
\]

and on applying partial summation we obtain (5).
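The two estimates in this argument can be observed numerically. The sketch below is illustrative only (the sieve and the function names are ours): it returns the differences s(x) − log x and Σ 1/p − log log x, the first of which should stay bounded while the second should settle near the constant A of (5), whose value is approximately 0.2615.

```python
from math import log


def primes_up_to(n):
    """All primes <= n, by the sieve of Eratosthenes (n >= 2)."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, int(n ** 0.5) + 1):
        if sieve[p]:
            # strike out the multiples of p from p*p onwards
            sieve[p * p::p] = bytearray(len(range(p * p, n + 1, p)))
    return [p for p in range(2, n + 1) if sieve[p]]


def mertens_differences(x):
    """Return (s(x) - log x, sum of 1/p for p <= x minus log log x)."""
    primes = primes_up_to(x)
    s = sum(log(p) / p for p in primes)
    reciprocal_sum = sum(1.0 / p for p in primes)
    return s - log(x), reciprocal_sum - log(log(x))
```

Calling `mertens_differences(10**5)` takes a fraction of a second; the second component is already within about 0.001 of the limiting constant.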

Another result of Mertens is of interest in connection with
Dirichlet’s work on primes in an arithmetic progression. If χ is any
nonprincipal character (mod q), it follows from the results of §4
that

\[
\sum_{p} \frac{\chi(p)}{p^{s}}
\]

has a finite limit as s → 1 from the right; for the amount by which this
series differs from log L(s, χ) is trivial. Mertens proved the deeper
result, which is suggested by the preceding one but cannot be
deduced directly from it, that

\[
\sum_{p} \frac{\chi(p)}{p} \tag{6}
\]

converges. From this and (5) he easily deduced, by taking a linear
combination of characters in the usual way, that

\[
\sum_{\substack{p \le x \\ p \equiv a \pmod q}} \frac{1}{p} = \frac{1}{\varphi(q)} \log\log x + A(q, a) + O[(\log x)^{-1}].
\]

We have here a more precise form of Dirichlet’s theorem that the
series on the left, when extended to infinity, is divergent.

The proof of the convergence of the series (6) is simple and 
ingenious. Using (3), we have 


x(n)logn _ x(m, )x(m2)A(m) 
2 n < oes m,;m), 


5 mst) sams) 


m, Sx m, m2 <x/m, mM), 


nsx 
The inner sum on the right differs from L(1, χ) by a remainder that is
O(m₁/x), by partial summation. Hence the last expression is

$$L(1, \chi)\sum_{m \le x}\frac{\chi(m)\Lambda(m)}{m} + O\left[x^{-1}\sum_{m \le x}\Lambda(m)\right].$$

The last error term is O[x⁻¹(log x)π(x)] = O(1), by (1). Since the
series Σχ(n)(log n)/n is convergent, by Dirichlet’s test, and since
L(1, χ) ≠ 0, it follows that

$$\sum_{m \le x}\frac{\chi(m)\Lambda(m)}{m} = O(1),$$


and from this the convergence of the series (6) is deduced by partial
summation.

It may be of interest to observe that the convergence of the series
(6) implies the convergence of the Euler product for L(s, χ) when
s = 1. Hence

$$L(1, \chi) = \prod_p\left[1 - \chi(p)p^{-1}\right]^{-1}.$$
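To see the Euler product at s = 1 at work, here is a small Python sketch for one concrete case that is not treated in the text: the nonprincipal character modulo 4, for which L(1, χ) = 1 − 1/3 + 1/5 − ⋯ = π/4 by Leibniz's series. The helper names are hypothetical. The partial products over p ≤ x oscillate but drift toward π/4 as x grows, in line with the conditional convergence asserted above.

```python
import math


def primes_up_to(n):
    """Sieve of Eratosthenes: return the list of all primes p <= n (n >= 2)."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0] = sieve[1] = 0
    for i in range(2, math.isqrt(n) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(2, n + 1) if sieve[i]]


def chi4(n):
    """Nonprincipal character mod 4: +1 if n = 1, -1 if n = 3 (mod 4), else 0."""
    return (0, 1, 0, -1)[n % 4]


def partial_euler_product(x):
    """Product of [1 - chi4(p)/p]^(-1) over the primes p <= x."""
    prod = 1.0
    for p in primes_up_to(x):
        prod /= 1.0 - chi4(p) / p
    return prod


if __name__ == "__main__":
    for x in (10**2, 10**3, 10**4, 10**5):
        print(x, partial_euler_product(x), math.pi / 4)
```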
RIEMANN’S MEMOIR


In his epoch-making memoir of 1860 (his only paper on the theory
of numbers) Riemann showed that the key to the deeper investigation
of the distribution of the primes lies in the study of ζ(s) as a function
of the complex variable s. More than 30 years were to elapse,
however, before any of Riemann’s conjectures were proved, or any
specific results about primes were established on the lines which
he had indicated.

Riemann proved two main results: 

(a) The function ζ(s) can be continued analytically over the whole
plane and is then meromorphic, its only pole being a simple pole
at s = 1 with residue 1. In other words, ζ(s) − (s − 1)⁻¹ is an
integral function.

(b) ζ(s) satisfies the functional equation

$$\pi^{-\frac12 s}\,\Gamma(\tfrac12 s)\,\zeta(s) = \pi^{-\frac12(1-s)}\,\Gamma[\tfrac12(1-s)]\,\zeta(1-s),$$

which can be expressed by saying that the function on the left is an
even function of s − ½. The functional equation allows the properties
of ζ(s) for σ < 0 to be inferred from its properties for σ > 1. In
particular, the only zeros of ζ(s) for σ < 0 are at the poles of Γ(½s),
that is, at the points s = −2, −4, −6, .... These are called the trivial
zeros. The remainder of the plane, where 0 ≤ σ ≤ 1, is called the
critical strip.

Riemann further made a number of remarkable conjectures. 

(a′) ζ(s) has infinitely many zeros in the critical strip. These will
necessarily be placed symmetrically with respect to the real axis,
and also with respect to the central line σ = ½ (the latter because of
the functional equation).

(b′) The number N(T) of zeros of ζ(s) in the critical strip with
0 < t ≤ T satisfies the asymptotic relation

$$N(T) = \frac{T}{2\pi}\log\frac{T}{2\pi} - \frac{T}{2\pi} + O(\log T). \tag{1}$$

This was proved by von Mangoldt, first in 1895 with a slightly less
good error term and then fully in 1905. We shall come to the proof
in §15.
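For a sense of scale, the main term of (1) can be evaluated directly. The Python sketch below (function name hypothetical) computes (T/2π) log(T/2π) − T/2π; the comparison values N(100) = 29 and N(1000) = 649 are quoted from standard tables of zeros and are not taken from the text. The discrepancy in both cases is comfortably within the O(log T) allowed by (1).

```python
import math


def zero_count_main_term(T):
    """Main term (T/2pi) log(T/2pi) - T/2pi of the asymptotic relation (1)."""
    u = T / (2.0 * math.pi)
    return u * math.log(u) - u


if __name__ == "__main__":
    # Tabulated zero counts, assumed here as external reference data.
    for T, tabulated in ((100.0, 29), (1000.0, 649)):
        print(T, zero_count_main_term(T), tabulated, math.log(T))
```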

(c′) The integral function ξ(s) defined by

$$\xi(s) = \tfrac12 s(s-1)\,\pi^{-\frac12 s}\,\Gamma(\tfrac12 s)\,\zeta(s)$$

(integral because it has no pole for σ ≥ ½ and is an even function of
s − ½) has the product representation

$$\xi(s) = e^{A + Bs}\prod_\rho\left(1 - \frac{s}{\rho}\right)e^{s/\rho}, \tag{2}$$

where A and B are constants and ρ runs through the zeros of ζ(s)
in the critical strip. This was proved by Hadamard in 1893, as also
was (a′) above. It played an important part in the proofs of the prime
number theorem by Hadamard and de la Vallée Poussin. We shall
come to the proof in §§11 and 12.

(d′) There is an explicit formula for π(x) − li x, valid for x > 1,
the most important part of which consists of a sum over the complex
zeros ρ of ζ(s). As this is somewhat complicated to state, we give
instead the closely related but somewhat simpler formula for
ψ(x) − x, where

$$\psi(x) = \sum_{n \le x}\Lambda(n). \tag{3}$$

It is:

$$\psi(x) - x = -\sum_\rho\frac{x^\rho}{\rho} - \frac{\zeta'(0)}{\zeta(0)} - \tfrac12\log(1 - x^{-2}). \tag{4}$$

This was proved by von Mangoldt in 1895 (as was Riemann’s original
formula), and we give the proof in §17. In interpreting (4) two
conventions have to be observed: first, in the sum over ρ the terms ρ and
ρ̄ are to be taken together, and second, if x is an integer, the last term
Λ(x) in the sum (3) defining ψ(x) is to be replaced by ½Λ(x).

(e′) The famous Riemann Hypothesis, still undecided: that the
zeros of ζ(s) in the critical strip all lie on the central line σ = ½. It was
proved by Hardy in 1914 that infinitely many of the zeros lie on the
line, and by A. Selberg in 1942 that a positive proportion at least of
all the zeros lie on the line.

There is very little indication of how Riemann was led to some of
these conjectures. In 1932 Siegel¹ published an asymptotic expansion
for ζ(s), valid in the critical strip, which had its origin in notes
of Riemann preserved in the Göttingen University Library. From
Siegel’s description of the notes, it is plain that Riemann had more
knowledge about ζ(s) than is apparent from his published memoir;
but there is no reason to think that he had proofs of any of his
conjectures.


¹ Quellen und Studien zur Geschichte der Mathematik, 2, 45-80 (1932).

In the present section we shall prove what Riemann proved, that
is (in effect) the functional equation, and we shall follow one of his
two methods. Many other proofs have since been given,² but this
one is still the most elegant.

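Riemann started from the classical definition of the Γ function:

$$\Gamma(\tfrac12 s) = \int_0^\infty e^{-t} t^{\frac12 s - 1}\,dt,$$

valid for σ > 0. Putting t = n²πx, we get

$$\pi^{-\frac12 s}\,\Gamma(\tfrac12 s)\,n^{-s} = \int_0^\infty x^{\frac12 s - 1} e^{-n^2\pi x}\,dx.$$

Hence, for σ > 1,

$$\pi^{-\frac12 s}\,\Gamma(\tfrac12 s)\,\zeta(s) = \int_0^\infty x^{\frac12 s - 1}\left(\sum_{n=1}^\infty e^{-n^2\pi x}\right)dx,$$

the inversion of order being justified by the convergence of

$$\sum_{n=1}^\infty \int_0^\infty x^{\frac12\sigma - 1} e^{-n^2\pi x}\,dx.$$

Writing

$$\omega(x) = \sum_{n=1}^\infty e^{-n^2\pi x},$$

we have

$$\pi^{-\frac12 s}\,\Gamma(\tfrac12 s)\,\zeta(s) = \int_0^\infty x^{\frac12 s - 1}\omega(x)\,dx = \int_1^\infty x^{\frac12 s - 1}\omega(x)\,dx + \int_1^\infty x^{-\frac12 s - 1}\omega(1/x)\,dx.$$

Plainly

$$2\omega(x) = \theta(x) - 1,$$

where

$$\theta(x) = \sum_{n=-\infty}^{\infty} e^{-n^2\pi x}. \tag{5}$$

This function satisfies the simple functional equation

$$\theta(x^{-1}) = x^{\frac12}\,\theta(x) \quad\text{for } x > 0, \tag{6}$$

as we shall prove below; this equation is a special case of those
satisfied by the ϑ functions of Jacobi. It follows that

$$\omega(x^{-1}) = -\tfrac12 + \tfrac12 x^{\frac12} + x^{\frac12}\omega(x).$$


² See Titchmarsh, Chap. 2.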
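Hence

$$\int_1^\infty x^{-\frac12 s - 1}\omega(x^{-1})\,dx = \int_1^\infty x^{-\frac12 s - 1}\left[-\tfrac12 + \tfrac12 x^{\frac12} + x^{\frac12}\omega(x)\right]dx = -\frac{1}{s} + \frac{1}{s-1} + \int_1^\infty x^{-\frac12 s - \frac12}\omega(x)\,dx.$$

We have therefore proved that

$$\pi^{-\frac12 s}\,\Gamma(\tfrac12 s)\,\zeta(s) = \frac{1}{s(s-1)} + \int_1^\infty \left(x^{\frac12 s - 1} + x^{-\frac12 s - \frac12}\right)\omega(x)\,dx. \tag{7}$$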

This holds for σ > 1. But the integral on the right converges
absolutely for any s, and converges uniformly with respect to s in any
bounded part of the plane, since

$$\omega(x) = O(e^{-\pi x})$$

as x → +∞. Hence the integral represents an everywhere regular
function of s, and the above formula gives the analytic continuation
of ζ(s) over the whole plane. It also gives the functional equation,
since the right side is unchanged when s is replaced by 1 − s.

We note that the function

$$\xi(s) = \tfrac12 s(s-1)\,\pi^{-\frac12 s}\,\Gamma(\tfrac12 s)\,\zeta(s)$$

is regular everywhere. Since ½sΓ(½s) has no zeros, the only possible
pole of ζ(s) is at s = 1, and we have already seen (p. 32) that this is
in fact a simple pole with residue 1.

Since Γ(½s) ~ (½s)⁻¹ as s → 0, we deduce from (7) that ζ(0) = −½.
It is easily verified that, for x ≥ 1, the series

$$\omega(x) = e^{-\pi x} + e^{-4\pi x} + e^{-9\pi x} + \cdots$$

is so small that, if 0 < s < 1, the integral in (7) is less than {s(1 − s)}⁻¹. Hence
ζ(s) < 0 for 0 < s < 1. [The same conclusion may be drawn, more
simply, from (7) of §4.]

It remains to prove the functional equation (6) of the θ function.
We shall prove this in the more general form

$$\sum_{n=-\infty}^{\infty} e^{-(n+\alpha)^2\pi/x} = x^{\frac12}\sum_{n=-\infty}^{\infty} e^{-n^2\pi x + 2\pi i n\alpha}, \tag{8}$$

which reduces to (6) when α = 0, since we shall need this in the next
section. It is supposed in (8) that x > 0 and that α is any real number
(though actually the equation holds for complex x and α, provided
ℜx > 0, with the value of x^{1/2} which has argument between −¼π
and ¼π).

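By Poisson’s summation formula (§2),

$${\sum_{n=-N}^{N}}' \, e^{-(n+\alpha)^2\pi/x} = \sum_{\nu=-\infty}^{\infty}\int_{-N}^{N} e^{-(t+\alpha)^2\pi/x + 2\pi i\nu t}\,dt.$$

Here we can replace N by ∞, since

$$\int_N^\infty e^{-(t+\alpha)^2\pi/x}\cos 2\pi\nu t\,dt = -\frac{1}{2\pi\nu}\int_N^\infty \sin 2\pi\nu t\;d\left[e^{-(t+\alpha)^2\pi/x}\right]$$

by integration by parts, and therefore

$$\left|\sum_{\nu \ne 0}\int_N^\infty e^{-(t+\alpha)^2\pi/x}\cos 2\pi\nu t\,dt\right| < C e^{-(N+\alpha)^2\pi/x},$$

where C is a constant. Since this disappears as N → ∞, the limit
operation is justified. Thus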


$$\sum_{n=-\infty}^{\infty} e^{-(n+\alpha)^2\pi/x} = \sum_{\nu=-\infty}^{\infty}\int_{-\infty}^{\infty} e^{-(t+\alpha)^2\pi/x + 2\pi i\nu t}\,dt = x\sum_{\nu=-\infty}^{\infty} e^{-2\pi i\nu\alpha}\int_{-\infty}^{\infty} e^{-\pi x u^2 + 2\pi i\nu x u}\,du.$$

The quadratic in the exponent is

$$-\pi x(u - i\nu)^2 - \pi x\nu^2.$$

Now

$$\int_{-\infty}^{\infty} e^{-\pi x(u+\beta)^2}\,du = \int_{-\infty}^{\infty} e^{-\pi x u^2}\,du = A x^{-\frac12},$$

where A is a positive constant; this holds for any β (real or complex)
and simply expresses a movement in the path of integration from the
real axis to another line parallel to it. Hence

$$\sum_{n=-\infty}^{\infty} e^{-(n+\alpha)^2\pi/x} = A x^{\frac12}\sum_{\nu=-\infty}^{\infty} e^{-\pi x\nu^2 - 2\pi i\nu\alpha}.$$

If we now take α = 0 and apply this formula twice, we get A² = 1,
whence A = 1. This proves (8), on replacing ν by −ν on the right.
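The transformation formula (8) converges so quickly on both sides that it is easy to test numerically. The following Python sketch (function names and the truncation point N are choices made for this illustration only) evaluates both sides for a few real x > 0 and real α; the right side is complex in form, but its imaginary part cancels.

```python
import cmath
import math


def theta_left(x, alpha, N=30):
    """Left side of (8): sum over |n| <= N of exp(-(n + alpha)^2 pi / x)."""
    return sum(math.exp(-((n + alpha) ** 2) * math.pi / x) for n in range(-N, N + 1))


def theta_right(x, alpha, N=30):
    """Right side of (8): sqrt(x) * sum of exp(-n^2 pi x + 2 pi i n alpha)."""
    total = sum(
        cmath.exp(-n * n * math.pi * x + 2j * math.pi * n * alpha)
        for n in range(-N, N + 1)
    )
    return math.sqrt(x) * total


if __name__ == "__main__":
    for x in (0.3, 1.0, 2.5):
        for alpha in (0.0, 0.25, 0.7):
            print(x, alpha, theta_left(x, alpha), theta_right(x, alpha))
```

With α = 0 the same check covers the functional equation (6) of θ(x).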
THE FUNCTIONAL EQUATION
OF THE L FUNCTIONS


The functional equation for Dirichlet’s L functions was first
given by Hurwitz in 1882 (Werke I, pp. 72-88), though he confined
himself to real characters since he was primarily interested in
L functions in relation to quadratic forms. He first obtained the
functional equation for the more general ζ function ζ(s, α), which
will be given below, and deduced that of the L functions from it. We
shall follow the method used by de la Vallée Poussin in 1896, which
is an extension of that of Riemann used in the preceding section.

The functional equation is valid only for primitive characters.
We need the expression for χ(n) as a linear combination of imaginary
exponentials e_q(mn), which we used earlier in §1 [(4) and (5)] in the
case when the character is the Legendre symbol.

For any character χ(n) to the modulus q, the Gaussian sum
τ(χ) is defined by

$$\tau(\chi) = \sum_{m=1}^{q}\chi(m)\,e_q(m). \tag{1}$$

If (n, q) = 1, then

$$\chi(n)\,\tau(\bar\chi) = \sum_{m=1}^{q}\bar\chi(m)\,\chi(n)\,e_q(m) = \sum_{h=1}^{q}\bar\chi(h)\,e_q(nh), \tag{2}$$

on putting m ≡ nh (mod q). This gives the desired expression for
χ(n), provided that (n, q) = 1 and that τ(χ̄) ≠ 0.

We now prove that, if χ is a primitive character, the last relation
holds also when (n, q) > 1. We put

$$\frac{n}{q} = \frac{n_1}{q_1},$$

where (n₁, q₁) = 1 and q₁ | q, q₁ < q. We can suppose that q₁ > 1,
since the relation holds trivially if n is a multiple of q. We have to
prove that

$$\sum_{h=1}^{q}\bar\chi(h)\,e(n_1 h/q_1) = 0.$$

Write q = q₁q₂ and put h = uq₁ + v, where

$$0 \le u < q_2, \qquad 1 \le v \le q_1.$$

Then the exponential depends only on v, and it will suffice to prove
that

$$\sum_{u=0}^{q_2 - 1}\bar\chi(uq_1 + v) = 0$$

for every v. Considered as a function of v, the last sum, say S(v), is
periodic with period q₁, for the effect of replacing v by v + q₁ is to
change the range for u into 1 ≤ u ≤ q₂ and u = q₂ is equivalent to
u = 0. If c is any number satisfying

$$(c, q) = 1, \qquad c \equiv 1 \pmod{q_1}, \tag{3}$$

then

$$\bar\chi(c)\,S(v) = \sum_{u=0}^{q_2 - 1}\bar\chi(cuq_1 + cv) = \sum_{u=0}^{q_2 - 1}\bar\chi(uq_1 + cv) = S(v). \tag{4}$$

We now appeal to the characteristic property of primitive
characters (§5), namely that for (n, q) = 1, the function χ(n) is not
periodic to any modulus q₁ that is a proper factor of q. This implies
that there exist integers c₁, c₂ such that

$$(c_1, q) = (c_2, q) = 1, \qquad c_1 \equiv c_2 \pmod{q_1}, \qquad \chi(c_1) \ne \chi(c_2).$$

Hence there exists c ≡ c₁c₂⁻¹ (mod q) which satisfies (3) and has χ(c) ≠ 1.
It follows from (4) that S(v) = 0 for any v, as was to be proved.

We have proved that (2) holds independently of whether (n, q) = 1
or not. We now prove that, for a primitive character χ,

$$|\tau(\chi)| = q^{\frac12}. \tag{5}$$

The proof given in §3 for a cubic character to a prime modulus
applies equally to any nonprincipal (and therefore primitive) character
to a prime modulus, but does not readily extend to a composite
modulus. The simplest proof is an indirect one. By (2),

$$|\chi(n)|^2\,|\tau(\bar\chi)|^2 = \sum_{h_1=1}^{q}\sum_{h_2=1}^{q}\bar\chi(h_1)\,\chi(h_2)\,e_q(n(h_1 - h_2)).$$

Now sum for n over a complete set of residues (mod q). The sum of
the values of |χ(n)|² is φ(q), and the sum of the exponentials is 0 unless
h₁ ≡ h₂. Hence

$$\phi(q)\,|\tau(\bar\chi)|^2 = q\sum_{h}|\chi(h)|^2 = q\,\phi(q),$$

giving (5) (applied to χ̄, which is primitive whenever χ is).
Although it is not necessary for our purpose, it may be of interest
to evaluate τ(χ) for a nonprimitive χ in terms of τ(χ₁), where χ₁ is
the primitive character (mod q₁) that induces χ. We have

$$\tau(\chi) = \sum_{m=1}^{q}\chi(m)\,e(m/q) = \sum_{\substack{m=1 \\ (m,q)=1}}^{q}\chi_1(m)\,e(m/q).$$

Put q = q₁r. We first prove that τ(χ) = 0 if q₁ and r are not relatively
prime. Put D = (q₁, r); then the values of m that occur in the sum
can be expressed as

$$m = m_1 + tq_1r/D,$$

where

$$(m_1, q) = 1, \qquad 0 < m_1 < q_1r/D, \qquad 0 \le t < D.$$

But then χ₁(m) = χ₁(m₁), since q₁r/D is an integral multiple of q₁.
Hence the sum for τ(χ) contains, as a factor, the sum

$$\sum_{t=1}^{D} e(t/D),$$

and this is 0 since D > 1. Thus it remains only to consider the
case in which (q₁, r) = 1. Here we can put

$$m = uq_1 + vr, \qquad\text{where } 0 < u < r, \quad 0 < v < q_1.$$

This gives

$$\tau(\chi) = \sum_{\substack{u=1 \\ (u,r)=1}}^{r} e(u/r)\sum_{\substack{v=1 \\ (v,q_1)=1}}^{q_1}\chi_1(vr)\,e(v/q_1) = \mu(r)\,\chi_1(r)\,\tau(\chi_1).$$
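The statements about Gaussian sums in this section lend themselves to a direct numerical check. The Python sketch below uses one concrete family chosen for illustration (it does not appear in the text): the primitive character modulo 5 determined by χ(2) = i, together with the nonprimitive characters it induces modulo 15 and modulo 25. All function names are hypothetical. It checks relation (2) for every residue n, the modulus |τ(χ)| = q^{1/2} of (5), and the evaluation τ(χ) = μ(r)χ₁(r)τ(χ₁), which here reads τ(χ) = iτ(χ₁) for q = 15 (r = 3) and τ(χ) = 0 for q = 25 (r = 5, not prime to q₁).

```python
import cmath
import math


def e(x):
    """The exponential e(x) = exp(2 pi i x)."""
    return cmath.exp(2j * math.pi * x)


def gauss_sum(chi, q):
    """tau(chi) = sum over m = 1..q of chi(m) e(m/q), as in (1)."""
    return sum(chi(m) * e(m / q) for m in range(1, q + 1))


def chi5(n):
    """A primitive character mod 5 of order 4, fixed by chi5(2) = i."""
    return {0: 0, 1: 1, 2: 1j, 3: -1j, 4: -1}[n % 5]


def chi15(n):
    """The nonprimitive character mod 15 induced by chi5."""
    return chi5(n) if math.gcd(n, 15) == 1 else 0


def chi25(n):
    """The nonprimitive character mod 25 induced by chi5."""
    return chi5(n) if math.gcd(n, 25) == 1 else 0


def relation_2_gap(chi, q):
    """Largest deviation in (2) over all residues n, coprime to q or not."""
    def chi_bar(m):
        return complex(chi(m)).conjugate()

    tau_bar = gauss_sum(chi_bar, q)
    return max(
        abs(chi(n) * tau_bar - sum(chi_bar(h) * e(n * h / q) for h in range(1, q + 1)))
        for n in range(q)
    )


if __name__ == "__main__":
    print(abs(gauss_sum(chi5, 5)), math.sqrt(5))
    print(relation_2_gap(chi5, 5), relation_2_gap(chi15, 15))
    print(gauss_sum(chi15, 15), 1j * gauss_sum(chi5, 5))
    print(abs(gauss_sum(chi25, 25)))
```

For the primitive character the gap in (2) is zero up to rounding for every n, whereas for the induced character modulo 15 it is not, which illustrates why primitivity is needed when (n, q) > 1.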
We can now rewrite (2) in the form

$$\chi(n) = \frac{1}{\tau(\bar\chi)}\sum_{m=1}^{q}\bar\chi(m)\,e(mn/q). \tag{6}$$

The functional equation of an L function takes different forms
according as χ(−1) = 1 or χ(−1) = −1. One of these must hold,
since χ(−1)² = χ(1) = 1.

Suppose that χ(−1) = 1. We have

$$\pi^{-\frac12 s}\,q^{\frac12 s}\,\Gamma(\tfrac12 s)\,n^{-s} = \int_0^\infty e^{-n^2\pi x/q}\,x^{\frac12 s - 1}\,dx,$$

and on multiplying by χ(n) and summing over n we get

$$\pi^{-\frac12 s}\,q^{\frac12 s}\,\Gamma(\tfrac12 s)\,L(s, \chi) = \int_0^\infty x^{\frac12 s - 1}\sum_{n=1}^{\infty}\chi(n)\,e^{-n^2\pi x/q}\,dx, \tag{7}$$

for σ > 1. Since χ(−1) = 1 and χ(0) = 0, we can write this as

$$\tfrac12\int_0^\infty x^{\frac12 s - 1}\,\psi(x, \chi)\,dx,$$

where

$$\psi(x, \chi) = \sum_{n=-\infty}^{\infty}\chi(n)\,e^{-n^2\pi x/q}.$$
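A functional equation that relates ψ(x, χ) to ψ(x⁻¹, χ̄) can be
deduced from (6) and the functional equation (8) of §8, with x
replaced by x/q. We have

$$\begin{aligned}
\tau(\bar\chi)\,\psi(x, \chi) &= \sum_{m=1}^{q}\bar\chi(m)\sum_{n=-\infty}^{\infty} e^{-n^2\pi x/q + 2\pi i mn/q} \\
&= \sum_{m=1}^{q}\bar\chi(m)\,(q/x)^{\frac12}\sum_{n=-\infty}^{\infty} e^{-(n + m/q)^2\pi q/x} \\
&= (q/x)^{\frac12}\sum_{m=1}^{q}\bar\chi(m)\sum_{n=-\infty}^{\infty} e^{-(nq + m)^2\pi/(xq)} \\
&= (q/x)^{\frac12}\sum_{l=-\infty}^{\infty}\bar\chi(l)\,e^{-l^2\pi/(xq)} \\
&= (q/x)^{\frac12}\,\psi(x^{-1}, \bar\chi).
\end{aligned}$$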
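Now we split the integral in (7) into two parts, as in §8, and obtain

$$\begin{aligned}
\pi^{-\frac12 s}\,q^{\frac12 s}\,\Gamma(\tfrac12 s)\,L(s, \chi) &= \tfrac12\int_1^\infty x^{\frac12 s - 1}\,\psi(x, \chi)\,dx + \tfrac12\int_1^\infty x^{-\frac12 s - 1}\,\psi(x^{-1}, \chi)\,dx \\
&= \tfrac12\int_1^\infty x^{\frac12 s - 1}\,\psi(x, \chi)\,dx + \tfrac12\,\frac{q^{\frac12}}{\tau(\bar\chi)}\int_1^\infty x^{-\frac12 s - \frac12}\,\psi(x, \bar\chi)\,dx.
\end{aligned}$$

This expression represents an everywhere regular function of s,
and therefore gives the analytic continuation of L(s, χ) over the
whole plane, regular everywhere since Γ(½s) is never 0. Moreover,
if we replace s by 1 − s and χ by χ̄, the above expression becomes

$$\tfrac12\int_1^\infty x^{-\frac12 s - \frac12}\,\psi(x, \bar\chi)\,dx + \tfrac12\,\frac{q^{\frac12}}{\tau(\chi)}\int_1^\infty x^{\frac12 s - 1}\,\psi(x, \chi)\,dx,$$

which is equal to the previous expression multiplied by q^{1/2}/τ(χ),
since

$$\tau(\chi)\,\tau(\bar\chi) = q.$$

The last relation is a consequence of (5) and χ(−1) = 1, since the
latter implies that τ(χ̄) is the complex conjugate of τ(χ).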

We have now obtained the functional equation for L(s, χ) in the
form

$$\pi^{-\frac12(1-s)}\,q^{\frac12(1-s)}\,\Gamma[\tfrac12(1-s)]\,L(1-s, \bar\chi) = \frac{q^{\frac12}}{\tau(\chi)}\,\pi^{-\frac12 s}\,q^{\frac12 s}\,\Gamma(\tfrac12 s)\,L(s, \chi), \tag{8}$$

and this is valid for any primitive character χ to the modulus q
for which χ(−1) = 1. Since L(1 − s, χ̄) has no zeros for 1 − σ > 1,
that is, for σ < 0, and Γ[½(1 − s)] has no zeros at all, the only zeros
of L(s, χ) for σ < 0 are at s = −2, −4, −6, ..., corresponding to the
poles of Γ(½s). There is also a zero of L(s, χ) at s = 0, corresponding
to the pole of Γ(½s) there.

Suppose that χ(−1) = −1. The previous argument fails, since
now the function ψ(x, χ) simply vanishes. We modify the procedure
by writing ½(s + 1) in place of ½s in the original formula, so that this
becomes

$$\pi^{-\frac12(s+1)}\,q^{\frac12(s+1)}\,\Gamma[\tfrac12(s+1)]\,n^{-s} = \int_0^\infty n\,e^{-n^2\pi x/q}\,x^{\frac12 s - \frac12}\,dx,$$

and gives

$$\pi^{-\frac12(s+1)}\,q^{\frac12(s+1)}\,\Gamma[\tfrac12(s+1)]\,L(s, \chi) = \tfrac12\int_0^\infty \psi_1(x, \chi)\,x^{\frac12 s - \frac12}\,dx,$$

where

$$\psi_1(x, \chi) = \sum_{n=-\infty}^{\infty} n\,\chi(n)\,e^{-n^2\pi x/q}.$$

The functional equation satisfied by ψ₁(x, χ), analogous to that
satisfied by ψ(x, χ), is

$$\tau(\bar\chi)\,\psi_1(x, \chi) = i\,q^{\frac12}\,x^{-\frac32}\,\psi_1(x^{-1}, \bar\chi) \tag{9}$$

and this is proved by the same reasoning as before, but with an
appeal to the relation

$$\sum_{n=-\infty}^{\infty} n\,e^{-n^2\pi x/q + 2\pi i mn/q} = i\,(q/x)^{\frac32}\sum_{n=-\infty}^{\infty}(n + m/q)\,e^{-\pi(n + m/q)^2 q/x}. \tag{10}$$


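The latter is deduced from (8) of §8 as follows. We have

$$\sum_{n=-\infty}^{\infty} e^{-n^2\pi y + 2\pi i n\alpha} = y^{-\frac12}\sum_{n=-\infty}^{\infty} e^{-(n+\alpha)^2\pi/y}.$$

Differentiation with respect to α, justified by the uniform
convergence of the differentiated series, gives

$$2\pi i\sum_{n=-\infty}^{\infty} n\,e^{-n^2\pi y + 2\pi i n\alpha} = -2\pi y^{-\frac32}\sum_{n=-\infty}^{\infty}(n + \alpha)\,e^{-(n+\alpha)^2\pi/y},$$

and, on replacing y by x/q and α by m/q, we get (10).

Using (9) in the integral above, as in the preceding case, we obtain

$$\pi^{-\frac12(s+1)}\,q^{\frac12(s+1)}\,\Gamma[\tfrac12(s+1)]\,L(s, \chi) = \tfrac12\int_1^\infty \psi_1(x, \chi)\,x^{\frac12 s - \frac12}\,dx + \tfrac12\,\frac{i\,q^{\frac12}}{\tau(\bar\chi)}\int_1^\infty \psi_1(x, \bar\chi)\,x^{-\frac12 s}\,dx.$$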


This again gives the continuation of L(s, χ) as a regular function over
the whole plane. If we replace s by 1 − s and χ by χ̄, the expression
becomes equal to its previous value multiplied by iq^{1/2}/τ(χ), since
now

$$\tau(\chi)\,\tau(\bar\chi) = -q.$$

Thus the functional equation in the present case takes the form

$$\pi^{-\frac12(2-s)}\,q^{\frac12(2-s)}\,\Gamma[\tfrac12(2-s)]\,L(1-s, \bar\chi) = \frac{i\,q^{\frac12}}{\tau(\chi)}\,\pi^{-\frac12(s+1)}\,q^{\frac12(s+1)}\,\Gamma[\tfrac12(s+1)]\,L(s, \chi). \tag{11}$$

The zeros of L(s, χ) for σ < 0 are now at the poles of Γ[½(s + 1)],
that is, at s = −1, −3, −5, ....

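It is possible to put together the two forms of the functional
equation in (8) and (11) by introducing a number a, depending on
χ, defined by

$$a = \begin{cases} 0 & \text{if } \chi(-1) = 1, \\ 1 & \text{if } \chi(-1) = -1. \end{cases} \tag{12}$$

Then the functional equation takes the form: if

$$\xi(s, \chi) = (\pi/q)^{-\frac12(s+a)}\,\Gamma[\tfrac12(s+a)]\,L(s, \chi), \tag{13}$$

then

$$\xi(1-s, \bar\chi) = \frac{i^a\,q^{\frac12}}{\tau(\chi)}\,\xi(s, \chi). \tag{14}$$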


Another method of proof, as mentioned at the beginning of this
section, is to relate L(s, χ) to the function ζ(s, α), which is defined
for 0 < α ≤ 1 by

$$\zeta(s, \alpha) = \sum_{n=0}^{\infty}(n + \alpha)^{-s}. \tag{15}$$

This reduces to ζ(s) when α = 1 and to (2^s − 1)ζ(s) when α = ½.
The relationship follows at once from the periodicity of χ(n). We
have

$$L(s, \chi) = \sum_{m=1}^{q}\chi(m)\sum_{n=0}^{\infty}(qn + m)^{-s} = q^{-s}\sum_{m=1}^{q}\chi(m)\,\zeta(s, m/q). \tag{16}$$

The function ζ(s, α), like ζ(s), can be continued to be regular
everywhere except for a simple pole at s = 1. For σ < 0, it is expressible
in terms of two other convergent Dirichlet series, with 1 − s in
place of s, by the formula¹

$$\zeta(s, \alpha) = \frac{2\Gamma(1-s)}{(2\pi)^{1-s}}\left\{\sin\tfrac12\pi s\sum_{n=1}^{\infty}\frac{\cos 2\pi n\alpha}{n^{1-s}} + \cos\tfrac12\pi s\sum_{n=1}^{\infty}\frac{\sin 2\pi n\alpha}{n^{1-s}}\right\}.$$

The use of this relation in (16) leads again to the functional equation
for L(s, χ), though in an unsymmetric form.

There is nothing corresponding to a Euler product for ζ(s, α),
except when α = 1 or ½, and it behaves in many ways quite
differently from ζ(s). Heilbronn and I proved² that, if α is rational
(≠ 1 or ½) or transcendental, then ζ(s, α) has zeros in σ > 1, and
Cassels³ proved the same in the more difficult case when α is an
algebraic irrational.


¹ See, for example, Titchmarsh, §2.17.
² J. London Math. Soc., 11, 181-185 (1936).
³ J. London Math. Soc., 36, 177-184 (1961).
PROPERTIES OF THE Γ
FUNCTION


We collect some properties of the Γ function for convenience
of reference.¹ The usual definition is by means of Euler’s integral:

$$\Gamma(s) = \int_0^\infty e^{-t} t^{s-1}\,dt, \tag{1}$$

but this applies only for σ > 0. Weierstrass’ formula

$$\frac{1}{s\Gamma(s)} = e^{\gamma s}\prod_{n=1}^{\infty}(1 + s/n)\,e^{-s/n}, \tag{2}$$

where γ is Euler’s constant, applies in the whole plane, and shows
that Γ(s) has no zeros and has simple poles at s = 0, −1, −2, ....

Among the functional relations satisfied by Γ(s) are

$$\Gamma(s+1) = s\Gamma(s), \qquad \Gamma(s)\Gamma(1-s) = \pi/\sin\pi s, \qquad \Gamma(s)\Gamma(s + \tfrac12) = 2^{1-2s}\,\pi^{\frac12}\,\Gamma(2s), \tag{3}$$

the last being Legendre’s duplication formula. Combined, they
give

$$\Gamma(\tfrac12 s)/\Gamma(\tfrac12 - \tfrac12 s) = 2^{1-s}\,\pi^{-\frac12}\cos(\tfrac12 s\pi)\,\Gamma(s),$$

and if this is used in the functional equation of ζ(s) (p. 59), it gives
the unsymmetric form of the functional equation:

$$\zeta(1-s) = 2^{1-s}\,\pi^{-s}\,(\cos\tfrac12 s\pi)\,\Gamma(s)\,\zeta(s). \tag{4}$$

Stirling’s asymptotic formula, in the simple form

$$\log\Gamma(s) = (s - \tfrac12)\log s - s + \tfrac12\log 2\pi + O(|s|^{-1}), \tag{5}$$

is valid as |s| → ∞, in the angle −π + δ < arg s < π − δ, for any
fixed δ > 0. Under the same conditions,

$$\frac{\Gamma'(s)}{\Gamma(s)} = \log s + O(|s|^{-1}). \tag{6}$$
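The relations collected here are easy to spot-check for real arguments with the Γ routines of the Python standard library. The sketch below (function names hypothetical) measures the relative error in the reflection and duplication formulas of (3), the size of the remainder in Stirling's formula (5), and evaluates the right side of (4) at s = 2. For the last it assumes the classical value ζ(2) = π²/6, which is not derived in the text; the result is ζ(−1) = −1/12.

```python
import math


def reflection_gap(s):
    """Relative error in Gamma(s) Gamma(1 - s) = pi / sin(pi s), real non-integer s."""
    lhs = math.gamma(s) * math.gamma(1.0 - s)
    rhs = math.pi / math.sin(math.pi * s)
    return abs(lhs - rhs) / abs(rhs)


def duplication_gap(s):
    """Relative error in Gamma(s) Gamma(s + 1/2) = 2^(1-2s) sqrt(pi) Gamma(2s)."""
    lhs = math.gamma(s) * math.gamma(s + 0.5)
    rhs = 2.0 ** (1.0 - 2.0 * s) * math.sqrt(math.pi) * math.gamma(2.0 * s)
    return abs(lhs - rhs) / abs(rhs)


def stirling_remainder(s):
    """log Gamma(s) minus the main terms of (5), for real s > 0."""
    main = (s - 0.5) * math.log(s) - s + 0.5 * math.log(2.0 * math.pi)
    return math.lgamma(s) - main


def zeta_at_minus_one():
    """Right side of (4) at s = 2, using the assumed value zeta(2) = pi^2 / 6."""
    s = 2.0
    zeta_2 = math.pi ** 2 / 6.0
    return 2.0 ** (1.0 - s) * math.pi ** (-s) * math.cos(0.5 * math.pi * s) * math.gamma(s) * zeta_2


if __name__ == "__main__":
    print(reflection_gap(0.3), duplication_gap(0.3), duplication_gap(4.2))
    for s in (5.0, 50.0, 500.0):
        print(s, stirling_remainder(s), 1.0 / (12.0 * s))
    print(zeta_at_minus_one())
```

The printed Stirling remainders shrink in proportion to 1/s, consistent with the error term O(|s|⁻¹).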


¹ Proofs will be found in many books, e.g., in Whittaker and Watson, Modern
Analysis, Chaps. 12 and 13. See also Ingham, footnote on p. 57, with reference to (6).
INTEGRAL FUNCTIONS OF
ORDER 1


The next important progress in the theory of the ζ function,
after Riemann’s pioneering paper, was made by Hadamard, who
developed the theory of integral functions of finite order in the early
1890’s and applied it to ζ(s) via ξ(s). His results were used in both
the proofs of the prime number theorem, given by himself and by
de la Vallée Poussin, though later it was found that for the
particular purpose of proving the prime number theorem, they could
be dispensed with.

An integral function f(z) is said to be of finite order if there exists
a number α such that

$$f(z) = O(e^{|z|^\alpha}) \quad\text{as } |z| \to \infty. \tag{1}$$

We must have α > 0, excluding the case when f(z) is just a constant.
The lower bound of the numbers α with the property (1) is called the
order of f(z).

An integral function of finite order with no zeros is necessarily
of the form e^{g(z)}, where g(z) is a polynomial, and its order is simply
the degree of g(z) and so is an integer. For g(z) = log f(z) can be
defined so as to be single valued, and is itself an integral function.
It satisfies

$$\Re g(z) = \log|f(z)| < 2R^\alpha$$

on any large circle |z| = R. If we put

$$g(z) = \sum_{n=0}^{\infty}(a_n + ib_n)z^n$$

then

$$\Re g(z) = \sum_{n=0}^{\infty} a_n R^n\cos n\theta - \sum_{n=1}^{\infty} b_n R^n\sin n\theta,$$

for z = Re^{iθ}. If we assume g(0) = 0, as we may, then

$$\pi|a_n|R^n \le \int_0^{2\pi}|\Re g(Re^{i\theta})|\,d\theta = \int_0^{2\pi}\{|\Re g(Re^{i\theta})| + \Re g(Re^{i\theta})\}\,d\theta < 8\pi R^\alpha.$$

It follows, on making R → ∞, that aₙ = 0 if n > α, and similarly for
bₙ. This proves that g(z) is a polynomial, and it is then obvious that
the order of f(z) is equal to the degree of g(z).

We observe, for future reference, that in the preceding argument
it suffices if the estimate for f(z) on |z| = R holds for some sequence
of values of R with limit infinity, instead of for all large R.

Now suppose that an integral function f(z) of finite order ρ has
zeros at z₁, z₂, ... (multiple zeros being repeated as appropriate).
The question arises: How is the distribution of the zeros related
to the order ρ? This question is most easily answered by means of
Jensen’s formula¹: if z₁, ..., zₙ are the zeros of f(z) in |z| < R, and
there is no zero on |z| = R, then

$$\frac{1}{2\pi}\int_0^{2\pi}\log|f(Re^{i\theta})|\,d\theta - \log|f(0)| = \log\frac{R^n}{|z_1|\cdots|z_n|}. \tag{2}$$

[We suppose, for convenience, that f(0) ≠ 0.] An alternative
expression for the right side is

$$\int_0^R r^{-1}n(r)\,dr,$$

where n(r) denotes the number of zeros in |z| < r. For if |z₁| = r₁,
and so on, the value of the integral is

$$\log r_2/r_1 + 2\log r_3/r_2 + \cdots + n\log R/r_n = \log(R^n/r_1r_2\cdots r_n).$$

Jensen’s formula is easily established by factorizing f(z) as

$$(z - z_1)\cdots(z - z_n)F(z)$$

and proving the formula for each factor separately.
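Jensen's formula (2) can be tried out on a polynomial, where the zeros are known in advance. The Python sketch below is an illustration under stated assumptions: the test function is f(z) = ∏(z − z_k) for a hypothetical list of zeros, and the mean value over the circle is approximated by an equally spaced sum, which is very accurate for a smooth periodic integrand provided no zero lies on or very near |z| = R.

```python
import cmath
import math


def jensen_sides(zeros, R, samples=4096):
    """Return (left, right) sides of Jensen's formula (2) for f(z) = prod (z - z_k)."""

    def f(z):
        out = 1.0 + 0.0j
        for zk in zeros:
            out *= z - zk
        return out

    # Mean of log|f| over the circle |z| = R, by an equally spaced sum.
    mean_log = sum(
        math.log(abs(f(R * cmath.exp(2j * math.pi * k / samples))))
        for k in range(samples)
    ) / samples
    left = mean_log - math.log(abs(f(0.0)))

    # Only the zeros inside |z| < R contribute to the right side.
    right = sum(math.log(R / abs(zk)) for zk in zeros if abs(zk) < R)
    return left, right


if __name__ == "__main__":
    example_zeros = [0.5, -1 + 1j, 2j, 3.5]
    for R in (1.0, 3.0, 5.0):
        print(R, jensen_sides(example_zeros, R))
```

As R increases past the modulus of a zero, the right side picks up a new term log(R/|z_k|), which is the step in n(r) seen in the integral form of the formula.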
It follows from Jensen’s formula that the zeros of an integral
function of given order ρ cannot be too dense. For if α > ρ, we have

$$\log|f(Re^{i\theta})| < R^\alpha$$

for all sufficiently large R, whence

$$\int_0^R r^{-1}n(r)\,dr < R^\alpha - \log|f(0)| < 2R^\alpha.$$

Since

$$\int_R^{2R} r^{-1}n(r)\,dr \ge n(R)\int_R^{2R} r^{-1}\,dr = n(R)(\log 2),$$

it follows that

$$n(R) = O(R^\alpha). \tag{3}$$

A consequence of this estimate is that $\sum r_n^{-\beta}$ converges if β > α,
and therefore converges if β > ρ. For

$$\sum r_n^{-\beta} = \int_0^\infty r^{-\beta}\,dn(r) = \beta\int_0^\infty r^{-\beta-1}n(r)\,dr < \infty.$$


¹ Strangely enough, Jensen’s formula was not discovered until after the work of
Hadamard.


We are now in a position to represent f(z) by a simple canonical
product, of the kind introduced by Weierstrass. From now on we
suppose that ρ = 1, since this is the only case with which we shall be
concerned later. We can then assert that $\sum r_n^{-1-\varepsilon}$ converges for any
ε > 0, and in particular that $\sum r_n^{-2}$ converges. Hence the product

$$P(z) = \prod_{n=1}^{\infty}(1 - z/z_n)\,e^{z/z_n}$$

(if it does not terminate) converges absolutely for all z, and
converges uniformly in any bounded domain not containing any of
the points zₙ. Hence it represents an integral function with zeros
(of the appropriate multiplicities) at z₁, z₂, .... If we put

$$f(z) = P(z)F(z), \tag{4}$$

then F(z) is an integral function without zeros.

We cannot immediately conclude that F(z) = e^{g(z)}, where g(z) is
a polynomial, because it is not obvious that F(z) is of finite order.
The most direct way of proving the desired result is to obtain a
lower bound for |P(z)|, and hence an upper bound for |F(z)|, on a
sequence of circles |z| = R, and then appeal to the result proved
earlier. The values of R must be kept away from the numbers rₙ.
Since $\sum r_n^{-2}$ converges, the total length of all the intervals
$(r_n - r_n^{-2},\ r_n + r_n^{-2})$ on the real line is finite, and consequently there exist
arbitrarily large values of R with the property that

$$|R - r_n| > r_n^{-2} \quad\text{for all } n. \tag{5}$$

Put P(z) = P₁(z)P₂(z)P₃(z), where these are the subproducts
extended over the following sets of n:

P₁: |zₙ| < ½R,
P₂: ½R ≤ |zₙ| ≤ 2R,
P₃: |zₙ| > 2R.
For the factors of P₁ we have, on |z| = R,

$$|(1 - z/z_n)\,e^{z/z_n}| \ge (|z/z_n| - 1)\,e^{-|z/z_n|} > e^{-R/r_n},$$

and since

$$\sum_{r_n < \frac12 R} r_n^{-1} < (\tfrac12 R)^{\varepsilon}\sum_{n=1}^{\infty} r_n^{-1-\varepsilon},$$

it follows that

$$|P_1(z)| > \exp(-R^{1+2\varepsilon}).$$

For the factors of P₂, we have

$$|(1 - z/z_n)\,e^{z/z_n}| \ge e^{-2}\,|z - z_n|/2R > CR^{-3},$$

where C is a positive constant, by (5). The number of factors is less
than $R^{1+\varepsilon}$, by (3). Hence

$$|P_2(z)| > (CR^{-3})^{R^{1+\varepsilon}} > \exp(-R^{1+2\varepsilon}).$$

Finally, for the factors of P₃, we have

$$|(1 - z/z_n)\,e^{z/z_n}| > e^{-c(R/r_n)^2}$$

for some positive constant c, since |z/zₙ| < ½. We also have

$$\sum_{r_n > 2R} r_n^{-2} < (2R)^{-1+\varepsilon}\sum_{n=1}^{\infty} r_n^{-1-\varepsilon}$$

and therefore

$$|P_3(z)| > \exp(-R^{1+2\varepsilon}).$$

It follows that, on |z| = R, we have

$$|P(z)| > \exp(-R^{1+3\varepsilon}),$$

whence

$$|F(z)| < \exp(R^{1+4\varepsilon})$$

by (1) and (4). By what was proved earlier, this inequality, since it
holds for a sequence of values of R with limit infinity, implies that
F(z) = e^{g(z)}, where g(z) is a polynomial of degree at most 1. Finally
we have, therefore,

$$f(z) = e^{A + Bz}\prod_{n=1}^{\infty}(1 - z/z_n)\,e^{z/z_n}, \tag{6}$$

where A and B are constants.
We know that the series $\sum r_n^{-1-\varepsilon}$ converges for any ε > 0. The
series $\sum r_n^{-1}$ may or may not converge, but if it does then f(z) satisfies
the inequality

$$|f(z)| < e^{C|z|} \tag{7}$$

for some constant C. This follows at once from the inequality (valid
for all ζ)

$$|(1 - \zeta)\,e^{\zeta}| \le e^{2|\zeta|},$$

which itself follows from the power series for (1 − ζ)e^ζ.

To summarize the results of the present section:

An integral function of order 1 necessarily has the form (6). If
rₙ = |zₙ|, where the zₙ are the zeros of f(z), then $\sum r_n^{-1-\varepsilon}$ converges for
any ε > 0. If $\sum r_n^{-1}$ converges, then f(z) satisfies (7).
THE INFINITE PRODUCTS FOR
ξ(s) AND ξ(s, χ)


We apply the conclusions of the preceding section to the integral
function

$$\xi(s) = \tfrac12 s(s-1)\,\pi^{-\frac12 s}\,\Gamma(\tfrac12 s)\,\zeta(s). \tag{1}$$

We first prove that

$$|\xi(s)| < \exp(C|s|\log|s|) \quad\text{as } |s| \to \infty, \tag{2}$$

for some constant C; this will establish that ξ(s) is of order 1 at most.
Since ξ(s) = ξ(1 − s), it will suffice to prove the inequality when
σ ≥ ½. Obviously¹

$$|\tfrac12 s(s-1)\,\pi^{-\frac12 s}| < \exp(C|s|),$$

and

$$|\Gamma(\tfrac12 s)| < \exp(C|s|\log|s|)$$

by Stirling’s formula, which is applicable since −½π < arg s < ½π.
Thus it remains to estimate ζ(s), and this is possible on the basis of
the representation obtained in (7) of §4, namely

$$\zeta(s) = \frac{s}{s-1} - s\int_1^\infty (x - [x])\,x^{-s-1}\,dx,$$

valid for σ > 0. The integral is bounded for σ ≥ ½, and therefore

$$|\zeta(s)| < C|s| \tag{3}$$

when |s| is large. This completes the proof of (2).

We see further that, as s → +∞ through real values, the inequality
(2) is substantially (that is, apart from the value of C) the best
possible, since log Γ(s) ~ s log s and ζ(s) → 1. Consequently ξ(s) does
not satisfy the more precise inequality (7) of the preceding section.


¹ The constant C is not necessarily the same at each occurrence.
It follows that ξ(s) has an infinity of zeros, say ρ₁, ρ₂, ..., such that

$$\sum|\rho_n|^{-1-\varepsilon} \text{ converges for any } \varepsilon > 0 \tag{4}$$

and

$$\sum|\rho_n|^{-1} \text{ diverges;} \tag{5}$$

and that

$$\xi(s) = e^{A + Bs}\prod_\rho(1 - s/\rho)\,e^{s/\rho}. \tag{6}$$

The zeros of ξ(s) are the nontrivial zeros of ζ(s), for in (1) the trivial
zeros of ζ(s) are cancelled by the poles of Γ(½s), and ½sΓ(½s) has no
zeros, and the zero of s − 1 is cancelled by the pole of ζ(s). Hence ζ(s)
has an infinity of nontrivial zeros ρ in the critical strip 0 ≤ σ ≤ 1, and
these have the properties (4) and (5).

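The product formula (6) leads to an expression for ζ′(s)/ζ(s) as a
sum of partial fractions. Logarithmic differentiation of (6) gives

$$\frac{\xi'(s)}{\xi(s)} = B + \sum_\rho\left(\frac{1}{s - \rho} + \frac{1}{\rho}\right), \tag{7}$$

and, combined with the logarithmic derivative of (1), this gives

$$\frac{\zeta'(s)}{\zeta(s)} = B - \frac{1}{s-1} + \tfrac12\log\pi - \tfrac12\,\frac{\Gamma'(\tfrac12 s + 1)}{\Gamma(\tfrac12 s + 1)} + \sum_\rho\left(\frac{1}{s - \rho} + \frac{1}{\rho}\right). \tag{8}$$

This exhibits the pole of ζ(s) at s = 1 and the nontrivial zeros at
s = ρ. The trivial zeros at s = −2, −4, ... are contained in the
Γ term, since

$$-\tfrac12\,\frac{\Gamma'(\tfrac12 s + 1)}{\Gamma(\tfrac12 s + 1)} = \tfrac12\gamma + \sum_{n=1}^{\infty}\left(\frac{1}{s + 2n} - \frac{1}{2n}\right) \tag{9}$$

by logarithmic differentiation from (2) of §10. The representation
of ζ′/ζ in (8) will be the basis for much of the later work on ζ(s).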

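The constants A and B, though not very important, can be
evaluated. By (1),

$$\xi(1) = \tfrac12\,\pi^{-\frac12}\,\Gamma(\tfrac12)\lim_{s \to 1}(s-1)\zeta(s) = \tfrac12,$$

whence ξ(0) = ½ and therefore e^A = ½ by (6).
As regards B, we have

$$B = \frac{\xi'(0)}{\xi(0)} = -\frac{\xi'(1)}{\xi(1)}$$

from (7) and the functional equation ξ(s) = ξ(1 − s). By (1),

$$\frac{\xi'(s)}{\xi(s)} = \frac{\zeta'(s)}{\zeta(s)} + \frac{1}{s-1} - \tfrac12\log\pi + \tfrac12\,\frac{\Gamma'(\tfrac12 s + 1)}{\Gamma(\tfrac12 s + 1)}.$$

It follows from (9) and the series for log 2 that

$$-\tfrac12\,\frac{\Gamma'(\tfrac32)}{\Gamma(\tfrac32)} = \tfrac12\gamma - 1 + \log 2.$$

Hence

$$B = \tfrac12\gamma - 1 + \tfrac12\log 4\pi - \lim_{s \to 1}\left[\frac{\zeta'(s)}{\zeta(s)} + \frac{1}{s-1}\right].$$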

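To evaluate the limit, we have recourse again to (7) of §4:

$$\zeta(s) = \frac{s}{s-1} - sI(s), \qquad I(s) = \int_1^\infty (x - [x])\,x^{-s-1}\,dx.$$

A simple calculation shows that

$$\lim_{s \to 1}\left[\frac{\zeta'(s)}{\zeta(s)} + \frac{1}{s-1}\right] = 1 - I(1).$$

Now

$$I(1) = \int_1^\infty (x - [x])\,x^{-2}\,dx = \lim_{N \to \infty}\sum_{n=1}^{N-1}\int_n^{n+1}\frac{x - n}{x^2}\,dx = \lim_{N \to \infty}\left[\log N - \sum_{n=1}^{N} n^{-1} + 1\right] = 1 - \gamma.$$

Hence

$$B = -\tfrac12\gamma - 1 + \tfrac12\log 4\pi. \tag{10}$$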


We can give another interpretation for B, as follows. Although the
series Σ|ρ|⁻¹ diverges, the series Σρ⁻¹ converges, provided one
groups together the terms from ρ and ρ̄. For if ρ = β + iγ, then

$$\frac{1}{\rho} + \frac{1}{\bar\rho} = \frac{2\beta}{\beta^2 + \gamma^2} \le \frac{2}{|\rho|^2},$$

and we know that Σ|ρ|⁻² converges. It follows from (7) and the
functional equation for ξ(s) that

$$B + \sum_\rho\left(\frac{1}{s - \rho} + \frac{1}{\rho}\right) = -B - \sum_\rho\left(\frac{1}{1 - s - \rho} + \frac{1}{\rho}\right),$$

and the terms containing 1 − s − ρ and s − ρ cancel, since if ρ
is a zero then so is 1 − ρ. Thus

$$B = -\sum_\rho\frac{1}{\rho} = -2\sum_{\gamma > 0}\frac{\beta}{\beta^2 + \gamma^2}. \tag{11}$$

The numerical value of B is about −0.023; from this it can easily
be seen that |γ| > 6 for all zeros.
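The numerical claims here can be reproduced in a few lines of Python. The first function evaluates (10) with a hard-coded value of Euler's constant. The second gives one possible way of getting a lower bound for |γ| out of (11); it is a sketch of an argument, not necessarily the one the author had in mind. Each zero β + iγ with γ > 0, taken together with its mirror image 1 − β + iγ (which coincides with it when β = ½), contributes at least 1/(1 + γ²) to the sum 2Σβ/(β² + γ²), so that 1/(1 + γ²) ≤ −B.

```python
import math

# Euler's constant, hard-coded because the math module does not provide it.
EULER_GAMMA = 0.5772156649015329


def constant_B():
    """B = -gamma/2 - 1 + (1/2) log(4 pi), as in (10)."""
    return -0.5 * EULER_GAMMA - 1.0 + 0.5 * math.log(4.0 * math.pi)


def ordinate_lower_bound():
    """Lower bound for |gamma| from 1/(1 + gamma^2) <= -B (sketch argument above)."""
    return math.sqrt(1.0 / (-constant_B()) - 1.0)


if __name__ == "__main__":
    print(constant_B())
    print(ordinate_lower_bound())
```

The bound obtained this way is a little above 6, in agreement with the statement in the text; the true first ordinate is considerably larger.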
We now apply similar considerations to the L functions. Let χ be a
primitive character to the modulus q, and define, as in (13) of §9,

$$\xi(s, \chi) = (q/\pi)^{\frac12 s + \frac12 a}\,\Gamma(\tfrac12 s + \tfrac12 a)\,L(s, \chi), \tag{12}$$

where a is 0 or 1 as in (12) of §9. [Note that there is no need to include
the factor s(s − 1), which was inserted in the definition of ξ(s) to
cancel the poles of Γ(½s) and ζ(s) at s = 0 and s = 1 respectively.]
As we saw in §9, ξ(s, χ) is an integral function and satisfies the
functional equation

$$\xi(1 - s, \bar\chi) = \frac{i^a\,q^{\frac12}}{\tau(\chi)}\,\xi(s, \chi), \tag{13}$$

in which the multiplying factor has absolute value 1.

We need first an estimate for L(s, χ) when |s| is large, and this is
deduced on the same lines as for ζ(s), starting from (8) of §4. This
states that

$$L(s, \chi) = s\int_1^\infty S(x)\,x^{-s-1}\,dx, \qquad\text{where } S(x) = \sum_{n \le x}\chi(n),$$

and is valid for σ > 0. Since |S(x)| ≤ q, it implies that

$$|L(s, \chi)| \le 2q|s| \quad\text{for } \sigma \ge \tfrac12. \tag{14}$$
Hence

$$|\xi(s, \chi)| \le 2q^{\frac12(\sigma + a) + 1}\,|s|\,\left|\Gamma[\tfrac12(s + a)]\right| < q^{\frac12\sigma + \frac32}\exp(C|s|\log|s|) \tag{15}$$


when |s| is large. A similar result holds for σ < ½, by the functional
equation. Again this inequality is substantially the best possible as
s → +∞ through real values, since then L(s, χ) → 1. We conclude,
as for ζ(s), that L(s, χ) has an infinity of zeros ρ in the critical strip
0 ≤ σ ≤ 1, which have the properties (4) and (5). We also have

$$\xi(s, \chi) = e^{A + Bs}\prod_\rho(1 - s/\rho)\,e^{s/\rho}, \tag{16}$$

but A and B will now depend on χ. One can express e^A = ξ(0, χ) in
terms of ξ(1, χ̄) and therefore in terms of L(1, χ̄).

The analog of (8), obtained by logarithmic differentiation from
(16) and (12), is

$$\frac{L'(s, \chi)}{L(s, \chi)} = -\tfrac12\log\frac{q}{\pi} - \tfrac12\,\frac{\Gamma'(\tfrac12 s + \tfrac12 a)}{\Gamma(\tfrac12 s + \tfrac12 a)} + B(\chi) + \sum_\rho\left(\frac{1}{s - \rho} + \frac{1}{\rho}\right).$$

This, again, is the basis for much of the later work.

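The number $B(\chi)$ can be expressed in terms of the expansion of
$L'/L$ in powers of $s$, but it seems to be very difficult to estimate $B(\chi)$
at all satisfactorily as a function of $q$. (In subsequent arguments it will
usually be eliminated from the above equation by subtraction.) If
we argue as in the proof of (11), we get

$$B(\chi) = \frac{\xi'(0, \chi)}{\xi(0, \chi)} = -\frac{\xi'(1, \bar\chi)}{\xi(1, \bar\chi)} = -B(\bar\chi) - \sum_\rho\left(\frac{1}{1 - \bar\rho} + \frac{1}{\bar\rho}\right).$$

As $B(\bar\chi) = \overline{B(\chi)}$, it follows that

$$2\,\Re B(\chi) = -\sum_\rho\left(\Re\frac{1}{1 - \rho} + \Re\frac{1}{\rho}\right).$$

We now write $\rho$ in place of $1 - \bar\rho$; this is permissible since permutation
of non-negative terms does not alter a sum. Hence

$$\Re B(\chi) = -\frac12\sum_\rho\left(\frac{1}{\rho} + \frac{1}{\bar\rho}\right) = -\sum_\rho \Re\frac{1}{\rho}. \tag{18}$$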


In particular, if $\chi$ is a real character, $B(\chi)$ is negative and is expressed
in terms of the zeros $\rho$ by (11). The difficulty of estimating $B(\chi)$ is
connected with the fact that, as far as we know, $L(s, \chi)$ may have a
zero near to $s = 0$.

We observe that, for a complex $\chi$, the zeros of $L(s, \chi)$ are still
symmetric about the line $\sigma = \frac12$, since $1 - \bar\rho = \rho'$, but not about
the real axis.
A ZERO-FREE REGION FOR $\zeta(s)$


It was proved independently by Hadamard and de la Vallée
Poussin in 1896 that $\zeta(s) \ne 0$ on $\sigma = 1$. This was a vital step in their
proofs of the prime number theorem, and it remained a vital step in
all subsequent proofs until the discovery of an elementary proof¹ by
Selberg and Erdős in 1948.

For $\sigma > 1$, we have

$$\log\zeta(s) = \sum_p\sum_{m=1}^\infty m^{-1}p^{-ms} = \sum_p\sum_{m=1}^\infty m^{-1}p^{-m\sigma}e^{-itm\log p}.$$

If $\zeta(s)$ had a zero at $1 + it$, then $\Re\log\zeta(\sigma + it)$ would tend to $-\infty$
as $\sigma \to 1$ from the right. This suggests that the numbers $\cos(tm\log p)$
would be predominantly negative. But then we should expect the
numbers $\cos(2tm\log p)$ to be predominantly positive, and it seems
likely that this would contradict the fact that $\Re\log\zeta(\sigma + 2it)$
remains bounded above as $\sigma \to 1$.

The line of reasoning just indicated was worked out in rigorous
detail by Hadamard and (somewhat differently) by de la Vallée
Poussin. Mertens² put the proof in a more elegant form by employing
the inequality

$$3 + 4\cos\theta + \cos 2\theta \ge 0, \tag{1}$$

which holds for all $\theta$ because the left side is $2(1 + \cos\theta)^2$. Applied
to

$$\Re\log\zeta(s) = \sum_p\sum_{m=1}^\infty m^{-1}p^{-m\sigma}\cos(t\log p^m)$$

with $t$ replaced by $0$, $t$, $2t$ in succession, it gives

$$3\log\zeta(\sigma) + 4\,\Re\log\zeta(\sigma + it) + \Re\log\zeta(\sigma + 2it) \ge 0.$$


Hence

$$\zeta^3(\sigma)\,|\zeta^4(\sigma + it)\,\zeta(\sigma + 2it)| \ge 1 \tag{2}$$

for $\sigma > 1$. As $\sigma \to 1$, we have $\zeta(\sigma) \sim (\sigma - 1)^{-1}$. If $\zeta(1 + it) = 0$ for
some $t$ (which is necessarily not 0), then

$$|\zeta(\sigma + it)| < A(\sigma - 1)$$

for some constant $A$, as $\sigma \to 1$. Since $\zeta(\sigma + 2it)$ remains bounded as
$\sigma \to 1$, we get a contradiction to the inequality (2). It will be seen that
the success of the proof depends on the fact that the coefficient 4
in (1) is greater than the coefficient 3.

¹ For an account of this, see Hardy and Wright, Introduction to the Theory of
Numbers (4th ed.), Chap. 22.

² Sitzungsber. Akad. Wiss. Wien., Math.-Naturwiss. Classe, 107, 1429-1434 (1898).
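The termwise nature of Mertens' argument can be checked numerically. The following is a minimal sketch, not part of the original text: it verifies the identity behind (1) on a grid and evaluates a finite piece of the double sum for $3\log\zeta(\sigma) + 4\Re\log\zeta(\sigma+it) + \Re\log\zeta(\sigma+2it)$. The function names and the truncation parameters `prime_bound` and `max_m` are illustrative assumptions. Since every term is non-negative, any partial sum is non-negative as well.

```python
import math

def primes_up_to(n):
    """Sieve of Eratosthenes: list of all primes p <= n."""
    sieve = [True] * (n + 1)
    sieve[0:2] = [False, False]
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            for j in range(i * i, n + 1, i):
                sieve[j] = False
    return [i for i, is_prime in enumerate(sieve) if is_prime]

def mertens_kernel(theta):
    """Left side of inequality (1): 3 + 4 cos(theta) + cos(2 theta)."""
    return 3.0 + 4.0 * math.cos(theta) + math.cos(2.0 * theta)

def mertens_partial_sum(sigma, t, prime_bound=200, max_m=20):
    """Finite piece of sum_p sum_m m^-1 p^(-m sigma) * kernel(t m log p).

    prime_bound and max_m are arbitrary truncation choices (assumptions);
    the full double sum converges only for sigma > 1.
    """
    total = 0.0
    for p in primes_up_to(prime_bound):
        log_p = math.log(p)
        for m in range(1, max_m + 1):
            total += p ** (-m * sigma) / m * mertens_kernel(t * m * log_p)
    return total

if __name__ == "__main__":
    for t in (0.5, 3.0, 14.134725, 25.0):
        print(t, mertens_partial_sum(1.05, t))
```

The partial sums stay non-negative for every $t$ tried, which is exactly what forces (2): a zero at $1+it$ would make the middle term too negative for the other two to compensate.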

The argument was extended by de la Vallée Poussin in 1899 to
show that $\zeta(s) \ne 0$ in a thin region to the left of $\sigma = 1$, the breadth
of which at height $t$ is proportional to $(\log t)^{-1}$ for large $t$. In proving
this, it is more convenient to work with the function $\zeta'(s)/\zeta(s)$ than
with the function $\log\zeta(s)$, since the analytic continuation of the latter
to the left of $\sigma = 1$ is obviously difficult, whereas the former has its
only poles for $\sigma > 0$ at the zeros of $\zeta(s)$. By logarithmic differentiation
of the Euler product, as in (4) of §7, we have

$$-\Re\frac{\zeta'(s)}{\zeta(s)} = \sum_{n=1}^\infty \Lambda(n)\, n^{-\sigma}\cos(t\log n)$$

for $\sigma > 1$. Hence, by the same argument as before,

$$3\left[-\frac{\zeta'(\sigma)}{\zeta(\sigma)}\right] + 4\left[-\Re\frac{\zeta'(\sigma + it)}{\zeta(\sigma + it)}\right] + \left[-\Re\frac{\zeta'(\sigma + 2it)}{\zeta(\sigma + 2it)}\right] \ge 0. \tag{3}$$

The behavior of $-\zeta'(\sigma)/\zeta(\sigma)$ as $\sigma \to 1$ from the right presents no
difficulty; in view of the simple pole of $\zeta(s)$ at $s = 1$, we have

$$-\frac{\zeta'(\sigma)}{\zeta(\sigma)} < \frac{1}{\sigma - 1} + A$$

for $1 < \sigma \le 2$, where $A$ denotes a positive absolute constant (not
necessarily the same at each occurrence).

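The behavior of the other two functions near $\sigma = 1$ is obviously
much influenced by any zero that $\zeta(s)$ may have just to the left of
$\sigma = 1$, at a height near to $t$ or $2t$. This influence is rendered explicit
by the partial fraction formula

$$-\frac{\zeta'(s)}{\zeta(s)} = \frac{1}{s - 1} - B - \frac12\log\pi + \frac12\,\frac{\Gamma'(\frac12 s + 1)}{\Gamma(\frac12 s + 1)} - \sum_\rho\left(\frac{1}{s - \rho} + \frac{1}{\rho}\right),$$

which was (8) of §12. The $\Gamma$ term is less than $A\log t$ if $t \ge 2$ and
$1 \le \sigma \le 2$. Hence, in this region,

$$-\Re\frac{\zeta'(s)}{\zeta(s)} < A\log t - \sum_\rho \Re\left(\frac{1}{s - \rho} + \frac{1}{\rho}\right). \tag{4}$$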
The sum over $\rho$ is positive, since

$$\Re\frac{1}{s - \rho} = \frac{\sigma - \beta}{|s - \rho|^2} \quad\text{and}\quad \Re\frac{1}{\rho} = \frac{\beta}{|\rho|^2}.$$

We obtain a valid inequality when $s = \sigma + 2it$ by just omitting the
sum:

$$-\Re\frac{\zeta'(\sigma + 2it)}{\zeta(\sigma + 2it)} < A\log t. \tag{5}$$

As regards $s = \sigma + it$, we choose $t$ to coincide with the ordinate $\gamma$ of
a zero $\beta + i\gamma$, with $\gamma \ge 2$, and take just the one term $1/(s - \rho)$ in the
sum which corresponds to this zero:

$$-\Re\frac{\zeta'(\sigma + it)}{\zeta(\sigma + it)} < A\log t - \frac{1}{\sigma - \beta}.$$

Substituting these upper bounds in the basic inequality (3), we
obtain

$$\frac{4}{\sigma - \beta} < \frac{3}{\sigma - 1} + A\log t.$$

Take $\sigma = 1 + \delta/\log t$, where $\delta$ is a positive constant. Then

$$\beta < 1 + \frac{\delta}{\log t} - \frac{4\delta}{(3 + A\delta)\log t},$$

and if $\delta$ is suitably chosen in relation to $A$, this gives

$$\beta < 1 - \frac{c}{\log t},$$

where $c$ is a positive constant to which a numerical value could be
assigned. Thus we have proved:

There exists a positive numerical constant $c$ such that $\zeta(s)$ has no
zero in the region

$$\sigma \ge 1 - \frac{c}{\log t}, \qquad t \ge 2.$$

In view of the fact that $\zeta(s)$ has no zero arbitrarily near $\sigma = 1$ with
$|t| < 2$, we can also say that there exists a positive constant $c$ such
that $\zeta(s)$ has no zero in the region

$$\sigma \ge 1 - \frac{c}{\log(|t| + 2)}.$$
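The choice of $\delta$ "in relation to $A$" can be made concrete. The following worked step is a sketch only: the particular value $\delta = 1/(2A)$ is an illustrative assumption, not a choice made in the text, and any $\delta < 1/A$ would serve.

```latex
% From 4/(\sigma-\beta) < 3/(\sigma-1) + A\log t with \sigma = 1 + \delta/\log t:
\frac{4}{\sigma-\beta} < \frac{(3+A\delta)\log t}{\delta}
\quad\Longrightarrow\quad
1-\beta > \left(\frac{4\delta}{3+A\delta}-\delta\right)\frac{1}{\log t}
        = \frac{\delta\,(1-A\delta)}{(3+A\delta)\log t}.
% The right side is positive exactly when \delta < 1/A.
% Illustrative choice \delta = 1/(2A):
1-\beta > \frac{1}{14A}\cdot\frac{1}{\log t},
\qquad\text{so one may take } c = \frac{1}{14A}.
```

The computation also shows why the coefficient 4 must exceed 3: with equal coefficients the bracket $4\delta/(3+A\delta) - \delta$ would be negative for every $\delta > 0$.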
The breadth of the zero-free region was enlarged to

$$\frac{c\log\log t}{\log t}$$

by Littlewood in 1922, and to³

$$\frac{c(\alpha)}{(\log t)^{\alpha}}$$

for any $\alpha > \frac23$, by Vinogradov and Korobov independently in 1958.
These improvements depend on upper bounds for $\zeta(s)$ in a region
just to the left of $\sigma = 1$, which are deduced from somewhat elaborate
estimations of exponential sums.⁴

³ For the sake of simplicity, I give a slightly weakened version of the result.

⁴ For an account, see A. Walfisz, Weylsche Exponentialsummen in der neueren
Zahlentheorie, Berlin, 1963, Chaps. 2 and 5.
ZERO-FREE REGIONS FOR $L(s, \chi)$


There is no difficulty in extending the results of the preceding
section to the zeros of $L(s, \chi)$ when $\chi$ is a fixed character. But this is of
limited value; for many purposes it is important to allow $q$ to vary
and to have estimates that are explicit in respect of $q$. This raises
some difficult problems, and the results so far known are better for
complex characters than for real characters.

We no longer suppose that $t \ge 2$ but merely that $t \ge 0$. There is
no loss of generality in the latter supposition, for the zeros of
$L(s, \bar\chi)$ with $t < 0$ are the complex conjugates of the zeros of $L(s, \chi)$
with $t > 0$. We are concerned with nonprincipal characters only,
and therefore $q \ge 3$ throughout.

Logarithmic differentiation of the Euler product formula gives

$$-\frac{L'(s, \chi)}{L(s, \chi)} = \sum_{n=1}^\infty \Lambda(n)\, n^{-\sigma}\chi(n)\, e^{-it\log n}$$

for $\sigma > 1$. We can represent the real part of $\chi(n)e^{-it\log n}$, for $(n, q) = 1$,
as $\cos\theta$, and $\theta$ has to be replaced by $2\theta$ if $\chi$ is replaced by $\chi^2$ and $t$ by
$2t$, and has to be replaced by 0 if $\chi$ is replaced by $\chi_0$ and $t$ by 0.
Hence the analog of the inequality (3) of the preceding section is

$$3\left[-\frac{L'(\sigma, \chi_0)}{L(\sigma, \chi_0)}\right] + 4\left[-\Re\frac{L'(\sigma + it, \chi)}{L(\sigma + it, \chi)}\right] + \left[-\Re\frac{L'(\sigma + 2it, \chi^2)}{L(\sigma + 2it, \chi^2)}\right] \ge 0. \tag{1}$$

If $\chi$ is a real character (but only then) we have $\chi^2 = \chi_0$, and this
affects the argument. The effect is important only when $t$ is small and
we come under the influence of the pole of $L(s, \chi_0)$ at $s = 1$.

We suppose first that $\chi$ is a complex primitive character, and
follow as closely as possible the argument of the preceding section.
Again the first term presents no difficulty; we have

$$-\frac{L'(\sigma, \chi_0)}{L(\sigma, \chi_0)} = \sum_{n=1}^\infty \chi_0(n)\Lambda(n)\, n^{-\sigma} \le -\frac{\zeta'(\sigma)}{\zeta(\sigma)} < \frac{1}{\sigma - 1} + c_1$$

for $1 < \sigma \le 2$, where $c_1$ denotes a positive absolute constant (and
similarly for $c_2, \dots$ later¹).

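For the other two terms in (1), we have recourse to (17) of §12.
This gives

$$-\Re\frac{L'(s, \chi)}{L(s, \chi)} = \frac12\log\frac{q}{\pi} + \frac12\,\Re\frac{\Gamma'(\frac12 s + \frac12 a)}{\Gamma(\frac12 s + \frac12 a)} - \Re B(\chi) - \Re\sum_\rho\left(\frac{1}{s - \rho} + \frac{1}{\rho}\right),$$

where $a$ is 0 or 1. We can eliminate $B(\chi)$ and $\sum 1/\rho$ by appealing to (18)
of §12. Since the $\Gamma$ term above is $O[\log(t + 2)]$, we can express the
result in the form

$$-\Re\frac{L'(s, \chi)}{L(s, \chi)} < c_2\mathcal{L} - \sum_\rho \Re\frac{1}{s - \rho}, \tag{2}$$

where we have written for brevity

$$\mathcal{L} = \log q + \log(t + 2). \tag{3}$$

This holds (for $\sigma > 1$) for any primitive $\chi$, whether real or complex.
Since

$$\Re\frac{1}{s - \rho} = \frac{\sigma - \beta}{|s - \rho|^2} \ge 0,$$

we can as before omit the series or any part of it.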

We omit the whole series when estimating
$L'(\sigma + 2it, \chi^2)/L(\sigma + 2it, \chi^2)$. There is the minor complication that $\chi^2$, though
nonprincipal, may not be primitive. However, if $\chi_1$ is the primitive
character that induces $\chi^2$, it follows from (3) of §5 that

$$\left|\frac{L'(s, \chi^2)}{L(s, \chi^2)} - \frac{L'(s, \chi_1)}{L(s, \chi_1)}\right| = \left|\sum_{p \mid q}\frac{\chi_1(p)\,p^{-s}\log p}{1 - \chi_1(p)\,p^{-s}}\right| \le \sum_{p \mid q}\frac{p^{-\sigma}\log p}{1 - p^{-\sigma}} \le \sum_{p \mid q}\log p \le \log q.$$

Hence the upper bound in (2), namely $c_2\mathcal{L}$, remains valid.

¹ To leave the constants unnumbered, as we have done hitherto, would lead to
confusion in the present section.
We choose $t$ to be the ordinate $\gamma$ of a zero $\beta + i\gamma$ of $L(s, \chi)$, and by
retaining on the right of (2) only the one term corresponding to this
zero, we obtain

$$-\Re\frac{L'(\sigma + it, \chi)}{L(\sigma + it, \chi)} < c_2\mathcal{L} - \frac{1}{\sigma - \beta}.$$

The three estimates, when substituted in the basic inequality (1),
give

$$\frac{4}{\sigma - \beta} < \frac{3}{\sigma - 1} + c_3\mathcal{L}.$$

We take $\sigma = 1 + c_4/\mathcal{L}$, with a suitable $c_4$, and by the same argument
as in the preceding section we obtain

$$\beta < 1 - \frac{c_5}{\mathcal{L}}. \tag{4}$$

This has been proved for any complex primitive $\chi$, but the restriction
to primitive $\chi$ can be removed, since, by (3) of §5, any zeros of
$L(s, \chi)$ additional to those of $L(s, \chi_1)$, where $\chi_1$ induces $\chi$, are the
zeros of a finite number of factors $1 - \chi_1(p)p^{-s}$ and are on $\sigma = 0$.
We can accordingly assert that there exists a positive absolute
constant $c_5$ such that, if $\chi$ is a complex character to the modulus $q$,
any zero $\beta + i\gamma$ of $L(s, \chi)$ satisfies (4), where

$$\mathcal{L} = \log q + \log(|\gamma| + 2). \tag{5}$$

[We have modified the definition of $\mathcal{L}$ in (3) to accord with the
choice $t = \gamma$.]

Suppose next that $\chi$ is a real primitive character. The preceding
argument needs modification only in one respect: The inequality
for $-\Re L'/L$ with $s = \sigma + 2it$ and $\chi$ replaced by $\chi^2$ is no longer
applicable since $\chi^2$ is the principal character. We must now relate
$L'/L$ to $\zeta'/\zeta$, and by the same argument as that used above when $\chi^2$
was imprimitive, we have

$$\left|\frac{L'(s, \chi_0)}{L(s, \chi_0)} - \frac{\zeta'(s)}{\zeta(s)}\right| \le \log q$$

for $\sigma > 1$. As regards $-\zeta'/\zeta$, we cannot quote the inequality (5) of §13,
because this was proved only for large $t$. In proving it, a term $1/(s - 1)$
was neglected. When this is restored, the same argument as was
used there gives

$$-\Re\frac{\zeta'(s)}{\zeta(s)} < \Re\frac{1}{s - 1} + c_6\log(t + 2).$$

Hence

$$-\Re\frac{L'(\sigma + 2it, \chi^2)}{L(\sigma + 2it, \chi^2)} < \Re\frac{1}{\sigma - 1 + 2it} + c_7\mathcal{L},$$

where $\mathcal{L}$ is again defined by (3).


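Using this in place of the stronger inequality that was available
for complex $\chi$, we deduce from (1) that

$$\frac{4}{\sigma - \beta} < \frac{3}{\sigma - 1} + \Re\frac{1}{\sigma - 1 + 2it} + c_8\mathcal{L},$$

where now $t = \gamma$. If we take $\sigma = 1 + \delta/\mathcal{L}$, and postulate that
$\gamma \ge \delta/\mathcal{L}$, we get

$$\frac{4}{\sigma - \beta} < \frac{3\mathcal{L}}{\delta} + \frac{\mathcal{L}}{5\delta} + c_8\mathcal{L},$$

whence

$$\beta < 1 - \frac{4 - 5c_8\delta}{16 + 5c_8\delta}\,\frac{\delta}{\mathcal{L}}.$$

If $\delta$ is sufficiently small in relation to $c_8$, we get an inequality of the
form (4) but subject to the condition $\gamma \ge \delta/\mathcal{L}$, where $\mathcal{L}$ is given by
(5). This condition is satisfied if $\gamma \ge \delta/\log q$. We have therefore
proved that there exists a positive absolute constant $c_9$ such that,
if $0 < \delta < c_9$ and $\chi$ is a real nonprincipal character to the modulus $q$,
then any zero $\beta + i\gamma$ of $L(s, \chi)$ for which

$$|\gamma| \ge \frac{\delta}{\log q}$$

satisfies

$$\beta < 1 - \frac{\delta}{5\mathcal{L}},$$

where $\mathcal{L}$ is given by (5). We have omitted the requirement that $\chi$
should be primitive, for the same reason as before.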
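The passage from the inequality containing $\Re\,1/(\sigma - 1 + 2it)$ to the final bound for $\beta$ can be written out step by step. This is a sketch of the arithmetic only, under the choices made in the text ($\sigma = 1 + \delta/\mathcal{L}$, $t = \gamma \ge \delta/\mathcal{L}$); the symbol $c$ below stands for the constant multiplying $\mathcal{L}$ in that inequality.

```latex
% The extra term coming from the pole of L(s, \chi_0) at s = 1:
\Re\frac{1}{\sigma-1+2i\gamma}
  = \frac{\sigma-1}{(\sigma-1)^2+4\gamma^2}
  \le \frac{\delta/\mathcal{L}}{\delta^2/\mathcal{L}^2+4\delta^2/\mathcal{L}^2}
  = \frac{\mathcal{L}}{5\delta}.
% Hence
\frac{4}{\sigma-\beta} < \frac{(16+5c\delta)\,\mathcal{L}}{5\delta},
\qquad
\beta < 1+\frac{\delta}{\mathcal{L}}-\frac{20\,\delta}{(16+5c\delta)\,\mathcal{L}}
      = 1-\frac{4-5c\delta}{16+5c\delta}\cdot\frac{\delta}{\mathcal{L}}.
% The coefficient exceeds 1/5 as soon as
20-25c\delta > 16+5c\delta \iff \delta < \frac{2}{15c}.
```

So the bound $\beta < 1 - \delta/(5\mathcal{L})$ holds for all sufficiently small $\delta$, the threshold depending only on the absolute constant $c$.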

It remains to consider what can be proved about the zeros of
$L(s, \chi)$, for real nonprincipal $\chi$, with

$$|\gamma| < \frac{\delta}{\log q},$$

where $\delta$ is a small positive constant. We shall show that there is
at most one zero with $\sigma > 1 - \delta'/\log q$ for a suitable positive
constant $\delta'$ and that, if there is one, it must be real. The final clause
is in fact a corollary, for if there were a nonreal zero, there would
be two zeros at conjugate complex points.
The inequality (2), with $s = \sigma > 1$, can be written

$$-\frac{L'(\sigma, \chi)}{L(\sigma, \chi)} < c_{10}\log q - \sum_\rho\frac{1}{\sigma - \rho},$$

the last sum being real since the zeros occur in conjugate complex
pairs. In quoting this inequality we have assumed, as we may without
loss of generality, that $\chi$ is primitive. If there were zeros at $\beta \pm i\gamma$,
where $\gamma \ne 0$, we should have

$$-\frac{L'(\sigma, \chi)}{L(\sigma, \chi)} < c_{10}\log q - \frac{2(\sigma - \beta)}{(\sigma - \beta)^2 + \gamma^2}.$$

For the left side, there is the crude lower bound

$$-\frac{L'(\sigma, \chi)}{L(\sigma, \chi)} = \sum_{n=1}^\infty \chi(n)\Lambda(n)\, n^{-\sigma} \ge -\sum_{n=1}^\infty \Lambda(n)\, n^{-\sigma} = \frac{\zeta'(\sigma)}{\zeta(\sigma)} > -\frac{1}{\sigma - 1} - c_{11}.$$

Thus

$$-\frac{1}{\sigma - 1} < c_{12}\log q - \frac{2(\sigma - \beta)}{(\sigma - \beta)^2 + \gamma^2}.$$

We take $\sigma = 1 + 2\delta/\log q$; then

$$|\gamma| < \frac{\delta}{\log q} = \tfrac12(\sigma - 1) < \tfrac12(\sigma - \beta),$$

and the last inequality implies that

$$-\frac{1}{\sigma - 1} < c_{12}\log q - \frac{8}{5(\sigma - \beta)}.$$

If the $\delta$ of the previous result is sufficiently small in relation to $c_{12}$,
we get $\beta < 1 - \delta/\log q$.

The argument is substantially the same if, instead of two conjugate
complex zeros, there are two real zeros (or a double real
zero). Thus we have proved: There exists a positive absolute constant
$c_{13}$ such that, if $0 < \delta < c_{13}$, the only possible zero of $L(s, \chi)$
for a real nonprincipal $\chi$, satisfying

$$|\gamma| < \frac{\delta}{\log q}, \qquad \beta > 1 - \frac{\delta}{\log q},$$

is a single (simple) real zero.
The three results proved so far can be fitted together to give the
following theorem, which we state for convenience of reference. It
simplifies the statement to consider two cases according as $|t| \ge 1$
or $|t| \le 1$, since in the former case the number $\mathcal{L}$ is essentially
$\log q|t|$ and in the latter case it is essentially $\log q$.

THEOREM. There exists a positive absolute constant $c_{14}$ with
the following property. If $\chi$ is a complex character modulo $q$, then
$L(s, \chi)$ has no zero in the region defined by

$$\sigma \ge \begin{cases} 1 - \dfrac{c_{14}}{\log q|t|} & \text{if } |t| \ge 1, \\[2ex] 1 - \dfrac{c_{14}}{\log q} & \text{if } |t| \le 1. \end{cases} \tag{6}$$

If $\chi$ is a real nonprincipal character, the only possible zero of $L(s, \chi)$
in this region is a single (simple) real zero.

These results are due partly to Gronwall² and partly to Titchmarsh.³

We shall now prove a result due to Landau,⁴ the effect of which is
to assure us that if there exist values of $q$ for which an $L$ function
formed with a real primitive character (mod $q$) has a zero with
$\beta > 1 - c/\log q$, then such values of $q$ are very rare. He proved
that if $\chi_1$, $\chi_2$ are distinct real primitive characters to the moduli
$q_1$, $q_2$ respectively, and if the corresponding $L$ functions have real
zeros $\beta_1$, $\beta_2$, then

$$\min(\beta_1, \beta_2) < 1 - \frac{c_{15}}{\log q_1 q_2}, \tag{7}$$

where $c_{15}$ is some positive absolute constant. The possibility that
$q_1 = q_2$ is not excluded.

The proof is based on the fact that $\chi_1(n)\chi_2(n)$ is a character to the
modulus $q_1 q_2$, being multiplicative and periodic. It is not in general
a primitive character, but it is nonprincipal. For if $\chi_1(n)\chi_2(n) = 1$
whenever $(n, q_1 q_2) = 1$, we should have $\chi_1(n) = \chi_2(n)$ whenever
$(n, q_1 q_2) = 1$, and this would mean that the primitive characters
$\chi_1$ and $\chi_2$ would induce the same character to the modulus $q_1 q_2$.
This is impossible by the results of §5.

² Rendiconti di Palermo, 35, 145-159 (1913).
³ Rendiconti di Palermo, 54, 414-429 (1930); 57, 478-479 (1933).
⁴ Göttinger Nachrichten, 1918, 285-295.
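For $\sigma > 1$, we have

$$-\frac{L'(\sigma, \chi_1\chi_2)}{L(\sigma, \chi_1\chi_2)} < c_{16}\log q_1 q_2;$$

this is proved by the same argument as that which we applied to
$L(s, \chi^2)$ when $\chi^2$ was nonprincipal but not necessarily primitive.
Further, by (2),

$$-\frac{L'(\sigma, \chi_j)}{L(\sigma, \chi_j)} < c_{17}\log q_j - \frac{1}{\sigma - \beta_j} \qquad (j = 1, 2),$$

the symbol $\Re$ in (2) being now superfluous.
Now consider the expression

$$-\frac{\zeta'(\sigma)}{\zeta(\sigma)} - \frac{L'(\sigma, \chi_1)}{L(\sigma, \chi_1)} - \frac{L'(\sigma, \chi_2)}{L(\sigma, \chi_2)} - \frac{L'(\sigma, \chi_1\chi_2)}{L(\sigma, \chi_1\chi_2)} = \sum_{n=1}^\infty \Lambda(n)[1 + \chi_1(n)][1 + \chi_2(n)]\, n^{-\sigma} \ge 0.$$

On substituting the previous upper bounds, and also that for
$-\zeta'(\sigma)/\zeta(\sigma)$, we get

$$\frac{1}{\sigma - \beta_1} + \frac{1}{\sigma - \beta_2} < \frac{1}{\sigma - 1} + c_{18}\log q_1 q_2.$$

If $\sigma$ is taken to be $1 + \delta/\log q_1 q_2$, for a sufficiently small positive
constant $\delta$, the last inequality shows that $\beta_1$ and $\beta_2$ cannot both
be greater than $1 - \delta'/\log q_1 q_2$, for a suitable positive $\delta'$. This proves
(7).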

Various deductions can be made from the last result. In particular
one sees that for at most one of the real nonprincipal characters
$\chi$ (mod $q$) can $L(s, \chi)$ have a zero in the region (6). [We assume here
tacitly that $c_{14}$, in the definition (6), is diminished if necessary so as
to satisfy $c_{14} \le \frac12 c_{15}$.]

Another deduction concerns the possible sequence $q_1, q_2, \dots$ of
positive integers $q$ with the property that there is a real primitive
$\chi$ (mod $q$) for which $L(s, \chi)$ has a real zero $\beta$ satisfying

$$\beta > 1 - c_{19}/\log q. \tag{8}$$

If $c_{19}$ is suitably chosen, say $c_{19} = \frac13 c_{15}$, then

$$q_{j+1} > q_j^2.$$

For (7) implies that

$$1 - \frac{c_{19}}{\log q_j} < 1 - \frac{c_{15}}{\log q_j q_{j+1}},$$

whence the result.
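The final step can be spelled out in a few lines. This is a sketch that only rearranges the inequality quoted from (7), with $q_j < q_{j+1}$ two consecutive members of the sequence and $\beta_j$, $\beta_{j+1}$ the corresponding real zeros (these subscripted names are notation introduced here for the illustration).

```latex
% Both zeros satisfy (8), and q_j < q_{j+1}, so
\min(\beta_j,\beta_{j+1}) > 1-\frac{c_{19}}{\log q_j}.
% Comparing with (7) applied to the pair of characters:
\frac{c_{15}}{\log q_j q_{j+1}} < \frac{c_{19}}{\log q_j} = \frac{c_{15}}{3\log q_j}
\;\Longrightarrow\;
\log q_j+\log q_{j+1} > 3\log q_j
\;\Longrightarrow\;
q_{j+1} > q_j^{2}.
```

In other words, the exceptional moduli, if any exist, grow at least as fast as repeated squaring, which is the precise sense in which they are "very rare".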

A deduction made by Page⁵ and applied by him to the prime
number theorem for arithmetic progressions (see §20) concerns
the set of positive integers $q \le z$, where $z \ge 3$. If $c_{20}$ is a suitable
positive constant, there is at most one real primitive $\chi$ to a modulus
$q \le z$ for which $L(s, \chi)$ has a real zero $\beta$ satisfying

$$\beta > 1 - \frac{c_{20}}{\log z}. \tag{9}$$

The last inequality is, of course, of a somewhat more stringent
nature than (8). The proof is immediate; if there were two such
characters, both the zeros would satisfy

$$\beta > 1 - \frac{c_{20}}{\log z} \ge 1 - \frac{2c_{20}}{\log q_1 q_2},$$

and this would contradict (7) if $c_{20} = \frac12 c_{15}$.

If there is such an "exceptional" real character $\chi_1$ to a modulus
$q_1 \le z$, then $q_1$ will be a function of $z$, and the only real nonprincipal
characters $\chi$ to moduli $q \le z$ for which $L(s, \chi)$ has a real zero
satisfying (9) will be $\chi_1$ and the imprimitive characters induced by $\chi_1$.
Their moduli will be multiples of $q_1$.

The only obvious general upper bound for a real zero $\beta$ of an $L$
function corresponding to a real primitive $\chi$ is that which can be
derived from the lower bound for $L(1, \chi)$ provided by the class-number
formula. Since $h(d) \ge 1$, the formulas (15) and (16) of §6,
in which $d = \pm q$, give

$$L(1, \chi) > c_{21}\, q^{-1/2}. \tag{10}$$

We can easily prove that

$$|L'(\sigma, \chi)| < c_{22}\log^2 q \qquad\text{for } 1 - 1/\log q \le \sigma \le 1, \tag{11}$$

and it then follows, by the mean value theorem of the differential
calculus, that

$$L(1, \chi) = L(1, \chi) - L(\beta, \chi) < (1 - \beta)\, c_{22}\log^2 q,$$

whence

$$\beta < 1 - \frac{c_{23}}{q^{1/2}\log^2 q}. \tag{12}$$

By the usual argument, this holds also for real nonprincipal $\chi$, even
if imprimitive. If $\chi(-1) = 1$, which corresponds to the case $d > 0$,
the inequality (12) can be improved⁶ to the extent of a factor $\log q$,
since in (16) of §6 there is a factor $\log\varepsilon$, and $\log\varepsilon > \log\frac12(1 + q^{1/2})$.
The proof of (11) is as follows. We have

$$L'(\sigma, \chi) = -\sum_{n=1}^\infty \chi(n)(\log n)\, n^{-\sigma}$$

for $\sigma > 0$. Since $n^{-\sigma} = e^{-\sigma\log n} \le e\, n^{-1}$ for $n \le q$, we have

$$\left|\sum_{n=1}^{q}\chi(n)(\log n)\, n^{-\sigma}\right| \le e\sum_{n=1}^{q}(\log n)\, n^{-1} < c_{24}\log^2 q.$$

By partial summation, noting that $(\log n)\, n^{-\sigma}$ decreases for $n \ge q$,
we have

$$\left|\sum_{n=q+1}^{N}\chi(n)(\log n)\, n^{-\sigma}\right| \le (\log q)\, q^{-\sigma}\max_{N}\left|\sum_{q+1}^{N}\chi(n)\right| \le (\log q)\, e\, q^{-1} q.$$

These results imply (11).
We remark, for convenience of reference, that the same argument
applied to $L(\sigma, \chi)$ gives

$$|L(\sigma, \chi)| < c_{25}\log q \qquad\text{for } 1 - 1/\log q \le \sigma \le 1. \tag{13}$$

In §21 we shall prove a theorem due to Siegel, which establishes
a much more precise inequality for a real zero than that given in (12).
But whereas all the results of the present section have been "effective,"
in the sense that numerical values could be assigned to all
the constants, it does not seem to be possible to derive an effective
inequality from Siegel's theorem.

⁵ Proc. London Math. Soc., (2) 39, 116-141 (1935).

⁶ Goldfeld and Schinzel (Ann. Scuola Norm. Sup. Pisa Cl. Sci., (4) 2, 571-583 (1975))
have shown that $\beta < 1 - cq^{-1/2}$ if $\chi(-1) = -1$, and that $\beta < 1 - cq^{-1/2}\log q$ if
$\chi(-1) = 1$.
In this section we prove the approximate formula for $N(T)$,
the number of zeros of $\zeta(s)$ in the rectangle $0 < \sigma < 1$, $0 < t < T$,
which was stated by Riemann and established by von Mangoldt.
It was stated as (1) in §8.

It is convenient to work initially with $\xi(s)$ rather than with $\zeta(s)$
because of its simple functional equation. Assuming for simplicity
that $T$ (which we suppose to be large) does not coincide with the
ordinate of a zero, we have

$$2\pi N(T) = \Delta_R \arg\xi(s),$$

where $R$ is the rectangle in the $s$ plane with vertices at

$$2, \quad 2 + iT, \quad -1 + iT, \quad -1,$$

described in the positive sense. There is no change in $\arg\xi(s)$ as $s$
describes the base of the rectangle, since $\xi(s)$ is then real and nowhere
0. Further, the change as $s$ moves from $\frac12 + iT$ to $-1 + iT$
and then to $-1$ is equal to the change as $s$ moves from 2 to $2 + iT$
and then to $\frac12 + iT$, since

$$\xi(\sigma + it) = \xi(1 - \sigma - it) = \overline{\xi(1 - \sigma + it)}.$$

Hence

$$\pi N(T) = \Delta_L \arg\xi(s),$$

where $L$ denotes the line from 2 to $2 + iT$ and then to $\frac12 + iT$.
The definition of $\xi(s)$, in (1) of §12, can be written as

$$\xi(s) = (s - 1)\,\pi^{-\frac12 s}\,\Gamma(\tfrac12 s + 1)\,\zeta(s).$$

We have

$$\Delta_L \arg(s - 1) = \arg(iT - \tfrac12) = \tfrac12\pi + O(T^{-1}),$$

$$\Delta_L \arg\pi^{-\frac12 s} = \Delta_L(-\tfrac12 t\log\pi) = -\tfrac12 T\log\pi.$$
Also, by Stirling's formula (§10),

$$\begin{aligned}
\Delta_L \arg\Gamma(\tfrac12 s + 1) &= \Im\log\Gamma(\tfrac12 iT + \tfrac54) \\
&= \Im\left[(\tfrac12 iT + \tfrac34)\log(\tfrac12 iT + \tfrac54) - \tfrac12 iT - \tfrac54 + \tfrac12\log 2\pi + O(T^{-1})\right] \\
&= \tfrac12 T\log\tfrac12 T - \tfrac12 T + \tfrac38\pi + O(T^{-1}).
\end{aligned}$$

Hence

$$N(T) = \frac{T}{2\pi}\log\frac{T}{2\pi} - \frac{T}{2\pi} + \frac78 + S(T) + O(T^{-1}), \tag{1}$$

where

$$\pi S(T) = \Delta_L \arg\zeta(s).$$

To prove (1) of §8, it suffices to prove that

$$S(T) = O(\log T). \tag{2}$$

This is one of the few estimates connected with $\zeta(s)$ that has not,
as far as I know, been improved upon during the present century.
Since $\arg\zeta(2) = 0$, we can express the definition of $S(T)$ in the form

$$\pi S(T) = \arg\zeta(\tfrac12 + iT),$$

provided this argument is defined by continuous variation along
$L$, or, equivalently, by continuous horizontal movement from
$+\infty + iT$ to $\frac12 + iT$, starting with the value 0. In view of our
limited knowledge about $S(T)$, it would seem at first sight that we
might as well omit the term $\frac78$ in (1); but as we shall see later, it has a
certain significance.
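The quality of the main term in (1) can be illustrated numerically. The sketch below is not part of the original argument: it compares the main term with the actual zero count at a few heights, using the standard tabulated ordinates of the first ten zeros (rounded to six decimals). The names `n_main_term` and `count_zeros` are illustrative, and the table limits the comparison to $T$ below the eleventh ordinate (about 52.97).

```python
import math

# Ordinates of the first ten nontrivial zeros of zeta(s), rounded
# (standard tabulated values; the eleventh is about 52.970321).
FIRST_ZERO_ORDINATES = [
    14.134725, 21.022040, 25.010858, 30.424876, 32.935062,
    37.586178, 40.918719, 43.327073, 48.005151, 49.773832,
]

def n_main_term(T):
    """Main term of (1): (T/2pi) log(T/2pi) - T/2pi + 7/8."""
    u = T / (2.0 * math.pi)
    return u * math.log(u) - u + 7.0 / 8.0

def count_zeros(T, ordinates=FIRST_ZERO_ORDINATES):
    """N(T) read off from the table; only meaningful for T < 52.97."""
    return sum(1 for gamma in ordinates if gamma < T)

if __name__ == "__main__":
    for T in (20.0, 30.0, 40.0, 50.0):
        exact = count_zeros(T)
        approx = n_main_term(T)
        # exact - approx is S(T) + O(1/T) in the notation of (1).
        print(f"T={T:5.1f}  N(T)={exact:2d}  main term={approx:7.3f}  "
              f"difference={exact - approx:+.3f}")
```

At these heights the difference stays below 1 in absolute value, which is consistent with $S(T)$ being small compared with the main term, and it visibly depends on keeping the constant $\frac78$.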

We shall base the proof of (2) on the following

Lemma. If $\rho = \beta + i\gamma$ runs through the nontrivial zeros of $\zeta(s)$,
then for large $T$

$$\sum_\rho \frac{1}{1 + (T - \gamma)^2} = O(\log T). \tag{3}$$

For the proof, we refer to (4) of §13, which states that

$$-\Re\frac{\zeta'(s)}{\zeta(s)} < A\log t - \sum_\rho \Re\left(\frac{1}{s - \rho} + \frac{1}{\rho}\right)$$

for $1 \le \sigma \le 2$ and $t \ge 2$. In this formula we take $s = 2 + iT$.
Since $|\zeta'/\zeta|$ is bounded for such $s$, we obtain

$$\sum_\rho \Re\left(\frac{1}{s - \rho} + \frac{1}{\rho}\right) < A\log T.$$

As we have seen earlier, all the terms in both series are positive, and
since

$$\Re\frac{1}{s - \rho} = \frac{2 - \beta}{(2 - \beta)^2 + (T - \gamma)^2} \ge \frac{1}{4 + (T - \gamma)^2},$$

we obtain the assertion in the lemma.

Two immediate corollaries are: (a) The number of zeros with
$T - 1 < \gamma < T + 1$ is $O(\log T)$; (b) the sum $\sum (T - \gamma)^{-2}$ extended
over the zeros with $\gamma$ outside the interval just mentioned is also
$O(\log T)$.

Another deduction is that for large $t$ (not coinciding with the
ordinate of a zero) and $-1 \le \sigma \le 2$,

$$\frac{\zeta'(s)}{\zeta(s)} = {\sum_\rho}'\, \frac{1}{s - \rho} + O(\log t), \tag{4}$$

where the sum is limited to those $\rho$ for which $|t - \gamma| < 1$. For by
(8) of §12, applied at $s$ and at $2 + it$ and subtracted,

$$\frac{\zeta'(s)}{\zeta(s)} = O(\log t) + \sum_\rho\left(\frac{1}{s - \rho} - \frac{1}{2 + it - \rho}\right).$$

For the terms with $|\gamma - t| \ge 1$, we have

$$\left|\frac{1}{s - \rho} - \frac{1}{2 + it - \rho}\right| = \frac{2 - \sigma}{|(s - \rho)(2 + it - \rho)|} \le \frac{3}{|\gamma - t|^2},$$

and the sum of these is $O(\log t)$ by (b) above. As for the terms with
$|\gamma - t| < 1$, we have $|2 + it - \rho| \ge 1$, and the number of terms is
$O(\log t)$ by (a) above. Hence the result.

The estimate (2) for $S(T)$ follows easily from (4). For the definition
of $S(T)$ implies that

$$\pi S(T) = O(1) - \int_{\frac12 + iT}^{2 + iT} \Im\,[\zeta'(s)/\zeta(s)]\, ds,$$

the $O(1)$ term coming from the variation along $\sigma = 2$. Now

$$\int_{\frac12 + iT}^{2 + iT} \Im\,(s - \rho)^{-1}\, ds = \Delta\arg(s - \rho)$$

and this has absolute value at most $\pi$. The number of terms in the
sum in (4) is $O(\log T)$ by (a) above, and therefore (2) follows.¹

We have now proved the approximate formula for $N(T)$, from
which it follows, incidentally, that if the ordinates $\gamma > 0$ are enumerated
in increasing order as $\gamma_1, \gamma_2, \dots$ then

$$\gamma_n \sim 2\pi n/\log n \quad\text{as } n \to \infty.$$

It does not follow that $\gamma_{n+1} - \gamma_n \to 0$, but this result was proved by
Littlewood in 1924. The formula for $N(T)$ shows that

$$N(T + H) - N(T) > A\log T \qquad (T > T_0)$$

if $H$ is greater than some positive constant, and Titchmarsh proved
the more precise result that this holds for any fixed positive $H$,
with some positive $A$ that depends on $H$. It may be noted that, in
consequence, the estimate $O(\log T)$ in corollary (a) to the lemma is
best possible.

As regards the function $S(T)$, it was proved by Littlewood that

$$\int_0^T S(t)\, dt = O(\log T),$$

and this indicates a high degree of cancellation among the values
of the function. The result just stated would, of course, become
false if the term $\frac78$ had not been retained in (1).

For proofs of the results just stated, and for other results relative
to the zeros, see Titchmarsh, Chap. 9.

¹ For another proof, see Titchmarsh, §9.4.
THE NUMBER $N(T, \chi)$


Let $\chi$ be a primitive character to the modulus $q$, and let $N(T, \chi)$
denote the number of zeros of $L(s, \chi)$ in the rectangle

$$0 < \sigma < 1, \qquad |t| < T.$$

(It is no longer appropriate to consider only the upper half-plane,
since the zeros are not in general symmetrically placed with respect
to the real axis.) In the present section we prove the approximate
formula for $N(T, \chi)$ which corresponds to that for $N(T)$ proved
in the preceding section. Since we regard $N(T, \chi)$ as a function of the
two parameters $T$ and $q$, it is no longer appropriate to suppose $T$
arbitrarily large, and we merely assume that $T \ge 2$. The formula is

$$\tfrac12 N(T, \chi) = \frac{T}{2\pi}\log\frac{qT}{2\pi} - \frac{T}{2\pi} + O(\log T + \log q). \tag{1}$$

[I have inserted a factor $\frac12$ on the left for ease of comparison with
$N(T)$, and to compensate for the rectangle being doubled.]

The proof is on the same lines as for $N(T)$. But it is convenient
now to consider the variation in $\arg\xi(s, \chi)$ as $s$ describes the rectangle
$R$ with vertices at

$$\tfrac52 - iT, \quad \tfrac52 + iT, \quad -\tfrac32 + iT, \quad -\tfrac32 - iT,$$

so as to avoid the possible zero at $s = -1$. This rectangle includes
just one trivial zero of $L(s, \chi)$, at either $s = 0$ or $s = -1$, and therefore

$$2\pi[N(T, \chi) + 1] = \Delta_R \arg\xi(s, \chi).$$

The contribution of the left half of the contour is equal to that
of the right half, since

$$\arg\xi(\sigma + it, \chi) = \arg\overline{\xi(1 - \sigma + it, \chi)} + c,$$

where $c$ is independent of $s$.
By the definition of $\xi(s, \chi)$ in (12) of §12, we have to form the sum
of

$$\Delta\arg(q/\pi)^{\frac12 s + \frac12 a} = T\log(q/\pi),$$

$$\Delta\arg\Gamma(\tfrac12 s + \tfrac12 a) = T\log\tfrac12 T - T + O(1),$$

and $\Delta\arg L(s, \chi)$, and then multiply by 2. The terms above give
the main terms in (1), and it remains to prove, in effect, that

$$\arg L(\tfrac12 + iT, \chi) = O(\log T + \log q). \tag{2}$$

This follows, as before, from the following modified

Lemma. If $\rho = \beta + i\gamma$ runs through the nontrivial zeros of
$L(s, \chi)$, where $\chi$ is primitive, then for any real $t$,

$$\sum_\rho \frac{1}{1 + (t - \gamma)^2} = O(\mathcal{L}), \tag{3}$$

where $\mathcal{L} = \log q(|t| + 2)$.

The proof is as before, but the reference is now to (2) of §14.

As in the preceding section, it follows from this lemma, in conjunction
with (17) of §12, that for $t$ not coinciding with the ordinate
of a zero, and $-1 \le \sigma \le 2$,

$$\frac{L'(s, \chi)}{L(s, \chi)} = {\sum_\rho}'\, \frac{1}{s - \rho} + O(\mathcal{L}), \tag{4}$$

where the sum is limited to those $\rho$ for which $|t - \gamma| < 1$.

The approximate formula (1) implies, in particular, that for
large $q$ the number of zeros with $|t| < T_0$, where $T_0$ is a suitable
absolute constant, is greater than a constant multiple of $\log q$. This
shows that the estimate

$$\sum_\rho \frac{1}{1 + \gamma^2} = O(\log q)$$

is essentially the best possible.¹

For some purposes it is convenient to have an analog of (1) for
characters that are not necessarily primitive. If $\chi$ is an imprimitive
character, induced by the primitive character $\chi_1$ (mod $q_1$), then (1)
remains valid for $N(T, \chi)$ as defined, provided we replace $q$ by $q_1$.
But if $N_R(T, \chi)$ denotes the number of zeros in the rectangle $R$
defined above, we must include the zeros on $\sigma = 0$ of

$$\prod_{p \mid q}\,[1 - \chi_1(p)\, p^{-s}],$$

in accordance with (3) of §5. These are (for each $p$ not dividing $q_1$)
spaced at equal distances $2\pi/\log p$ apart. Their number, with
$|t| < T$, is

$$\le \sum_{p \mid q}\left(\frac{T\log p}{\pi} + 1\right) = O(T\log q).$$

Hence

$$\tfrac12 N_R(T, \chi) = \frac{T}{2\pi}\log\frac{T}{2\pi} + O(T\log q) \qquad\text{for } T \ge 2. \tag{5}$$

¹ For some results on the zeros of $L(s, \chi)$ when $q$ is large and $t$ is bounded, see
Siegel, Annals of Math., 46, 409-422 (1945), or Gesammelte Abhandlungen III, 47-60.
THE EXPLICIT FORMULA FOR $\psi(x)$


In this section we shall prove von Mangoldt's formula for $\psi(x)$,
which was stated in §8. We recall that

$$\psi(x) = \sum_{n \le x}\Lambda(n) = \sum_{p^m \le x}\log p.$$

This function has discontinuities at the points where $x$ is a prime
power, and in order that the formula may remain valid at these
points, it is necessary to modify the definition by taking the mean of
the values on the left and on the right. In other words, we define $\psi_0(x)$
to be $\psi(x)$ when $x$ is not a prime power, and $\psi(x) - \frac12\Lambda(x)$ when it is.
The formula asserts that, for $x > 1$,

$$\psi_0(x) = x - \sum_\rho \frac{x^\rho}{\rho} - \frac{\zeta'(0)}{\zeta(0)} - \frac12\log(1 - x^{-2}), \tag{1}$$

where the sum over the nontrivial zeros $\rho$ of $\zeta(s)$ is to be understood
in the symmetric sense as

$$\lim_{T \to \infty}\sum_{|\gamma| < T}\frac{x^\rho}{\rho}.$$

The value of the constant $\zeta'(0)/\zeta(0)$ is $\log 2\pi$, as can be deduced from
(8) and (10) of §12. The last term of the formula is equivalent to
$-\sum_\omega x^\omega/\omega$ extended over the trivial zeros of $\zeta(s)$ given by
$\omega = -2, -4, -6, \dots$.

To avoid some minor complications we shall suppose that $x \ge 2$,
though as stated above the formula is valid for $x > 1$.
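To get a feeling for formula (1), one can compare $\psi(x)$ with the right side truncated to a few zeros. The sketch below is illustrative only and makes several assumptions: it uses the tabulated ordinates of the first ten zeros (all of which lie on $\sigma = \frac12$), pairs each zero with its conjugate, and evaluates at a point that is not a prime power so that $\psi_0 = \psi$. The function names are hypothetical. The trivial-zero term uses $\sum_{k \ge 1} x^{-2k}/(2k) = -\frac12\log(1 - x^{-2})$. With so few zeros the agreement is only rough; the finite form with an explicit error term is what the rest of the section develops.

```python
import cmath
import math

# Ordinates of the first ten nontrivial zeros (standard values, rounded).
ZERO_ORDINATES = [
    14.134725, 21.022040, 25.010858, 30.424876, 32.935062,
    37.586178, 40.918719, 43.327073, 48.005151, 49.773832,
]

def mangoldt(n):
    """Lambda(n): log p if n is a power of the prime p, otherwise 0."""
    if n < 2:
        return 0.0
    for p in range(2, math.isqrt(n) + 1):
        if n % p == 0:
            while n % p == 0:
                n //= p
            return math.log(p) if n == 1 else 0.0
    return math.log(n)  # no divisor up to sqrt(n): n itself is prime

def psi(x):
    """Chebyshev's psi(x) = sum of Lambda(n) over n <= x."""
    return sum(mangoldt(n) for n in range(2, int(math.floor(x)) + 1))

def psi_truncated(x, ordinates=ZERO_ORDINATES):
    """Right side of (1) with the zero sum cut off at the tabulated zeros."""
    total = x - math.log(2.0 * math.pi) - 0.5 * math.log(1.0 - x ** -2)
    log_x = math.log(x)
    for gamma in ordinates:
        rho = complex(0.5, gamma)
        # rho and its conjugate together contribute 2 Re(x^rho / rho).
        total -= 2.0 * (cmath.exp(rho * log_x) / rho).real
    return total

if __name__ == "__main__":
    for x in (10.5, 20.5, 50.5):
        print(x, psi(x), psi_truncated(x))
```

Evaluating at half-integers avoids the jump points, where the truncated sum would approximate the mean value $\psi_0(x)$ rather than $\psi(x)$.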

The general lines on which such a formula can be proved, provided
that the argument can be justified, were indicated by Riemann in
connection with his explicit formula for $\pi(x)$. The basic idea is to
use the discontinuous integral

$$\frac{1}{2\pi i}\int_{c - i\infty}^{c + i\infty} y^s\,\frac{ds}{s} = \begin{cases} 0 & \text{if } 0 < y < 1, \\ \frac12 & \text{if } y = 1, \\ 1 & \text{if } y > 1, \end{cases} \tag{2}$$

where $c > 0$, to pick out the terms in a Dirichlet series with $n \le x$,
by taking $y = x/n$. Since

$$\sum_{n=1}^\infty \Lambda(n)\, n^{-s} = -\zeta'(s)/\zeta(s)$$

for $\sigma > 1$, the result takes the form

$$\psi_0(x) = \frac{1}{2\pi i}\int_{c - i\infty}^{c + i\infty}\left[-\frac{\zeta'(s)}{\zeta(s)}\right]\frac{x^s}{s}\, ds$$

for $c > 1$. If we can move the vertical line of integration away to
infinity on the left, we shall express $\psi_0(x)$ as the sum of the residues
of the function $[-\zeta'(s)/\zeta(s)]x^s/s$ at its poles. The pole of $\zeta(s)$ at $s = 1$
contributes $x$; the pole of $1/s$ at $s = 0$ contributes $-\zeta'(0)/\zeta(0)$; and
each zero $\rho$ of $\zeta(s)$, whether trivial or not, contributes $-x^\rho/\rho$.

To carry out this proof, we have to start with an integral from
$c - iT$ to $c + iT$, and regard this as one side of a rectangle extending
to the left. It is necessary to choose $T$ with a little care, so that the
horizontal sides of the rectangle shall avoid, as far as possible, the
zeros of $\zeta(s)$ in the critical strip. After the argument has been carried
out in detail, we shall have a finite form of (1), with an explicit
estimate for the error; and this will be much more useful than (1)
itself.

As a first step we prove the following

Lemma. Let $\delta(y)$ denote the function of $y$ on the right of (2), and
let

$$I(y, T) = \frac{1}{2\pi i}\int_{c - iT}^{c + iT}\frac{y^s}{s}\, ds.$$

Then, for $y > 0$, $c > 0$, $T > 0$,

$$|I(y, T) - \delta(y)| < \begin{cases} y^c\min(1,\ T^{-1}|\log y|^{-1}) & \text{if } y \ne 1, \\ cT^{-1} & \text{if } y = 1. \end{cases}$$

Proof. Suppose first that $0 < y < 1$. The function $y^{s}/s$ tends to
0 as $\sigma \to +\infty$, and does so uniformly in $t$. Hence we can replace the
vertical integral by two horizontal integrals:

$$I(y, T) = -\frac{1}{2\pi i}\int_{c+iT}^{\infty+iT} \frac{y^{s}}{s}\,ds
+ \frac{1}{2\pi i}\int_{c-iT}^{\infty-iT} \frac{y^{s}}{s}\,ds.$$

Now

$$\left|\int_{c+iT}^{\infty+iT} \frac{y^{s}}{s}\,ds\right|
< \frac{1}{T}\int_{c}^{\infty} y^{\sigma}\,d\sigma = \frac{y^{c}}{T|\log y|},$$

and similarly for the other integral. This proves one of the two
inequalities. The other is most easily obtained by replacing the vertical
path by a circular path with center $O$, on the right side. The radius is
$R = (c^{2} + T^{2})^{1/2}$, and on the circular arc we have $|y^{s}| \le y^{c}$ and
$|s| = R$. Hence

$$|I(y, T)| \le \frac{1}{2\pi}\,\pi R\,\frac{y^{c}}{R} < y^{c}.$$

The proof when $y > 1$ is similar but uses a rectangle or circular
arc to the left. The contour then includes the pole at $s = 0$, where
the residue is $1 = \delta(y)$.

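There remains the case $y = 1$, which is easily treated by direct
computation. With $s = c + it$, we have

$$I(1, T) = \frac{1}{2\pi}\int_{-T}^{T}\frac{dt}{c + it}
= \frac{1}{\pi}\int_{0}^{T}\frac{c\,dt}{c^{2} + t^{2}}
= \frac{1}{\pi}\int_{0}^{T/c}\frac{du}{1 + u^{2}}
= \frac{1}{2} - \frac{1}{\pi}\int_{T/c}^{\infty}\frac{du}{1 + u^{2}},$$

and the last integral is less than $c/T$. This proves the lemma.¹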
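The lemma can be checked numerically. The following is a minimal sketch, not part of the original argument: it approximates $I(y, T)$ by Simpson's rule and compares $|I(y, T) - \delta(y)|$ with the bound of the lemma. The function names, the step count and the sample values of $(y, c, T)$ are illustrative assumptions.

```python
import cmath
import math

def delta(y):
    """Right-hand side of (2): 0, 1/2 or 1 according as y < 1, y = 1, y > 1."""
    if y < 1:
        return 0.0
    return 0.5 if y == 1 else 1.0

def truncated_integral(y, c, T, steps=20000):
    """Simpson approximation to I(y, T); steps must be even."""
    # With s = c + it we have ds = i dt, so I = (1 / 2 pi) * int_{-T}^{T} y^s / s dt.
    h = 2.0 * T / steps
    log_y = math.log(y)

    def f(t):
        s = complex(c, t)
        return cmath.exp(s * log_y) / s

    total = f(-T) + f(T)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * f(-T + k * h)
    # The imaginary parts cancel by symmetry, so only the real part is kept.
    return (total * h / 3.0).real / (2.0 * math.pi)

def lemma_bound(y, c, T):
    """The bound stated in the lemma."""
    if y == 1:
        return c / T
    return y ** c * min(1.0, 1.0 / (T * abs(math.log(y))))

# Sample points (chosen for illustration only).
cases = [(0.5, 1.0, 10.0), (2.0, 1.0, 50.0), (1.0, 1.5, 20.0), (3.0, 0.5, 5.0)]
errors = [abs(truncated_integral(y, c, T) - delta(y)) for y, c, T in cases]
bounds = [lemma_bound(y, c, T) for y, c, T in cases]
for (y, c, T), e, b in zip(cases, errors, bounds):
    print(f"y={y}, c={c}, T={T}: error={e:.5f}, bound={b:.5f}")
```

In each sample case the computed error should fall below the bound, in agreement with the lemma.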
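Applied to $\psi_0(x)$, the result of the lemma gives

$$|\psi_0(x) - J(x, T)| < \sum_{\substack{n=1\\ n \ne x}}^{\infty}
\Lambda(n)\left(\frac{x}{n}\right)^{c}
\min\left(1, T^{-1}\left|\log\frac{x}{n}\right|^{-1}\right)
+ cT^{-1}\Lambda(x), \tag{3}$$

where $c > 1$ and

$$J(x, T) = \frac{1}{2\pi i}\int_{c-iT}^{c+iT}
\left[-\frac{\zeta'(s)}{\zeta(s)}\right]\frac{x^{s}}{s}\,ds. \tag{4}$$

It is to be understood that the term containing $\Lambda(x)$ is present only
if $x$ is a prime power.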

We choose $c = 1 + (\log x)^{-1}$, since this gives a good result
without excessive work, and note that $x^{c} = ex$. We have to estimate
the series on the right of (3), and we take first all terms for which
$n \le \tfrac{3}{4}x$ or $n \ge \tfrac{5}{4}x$. For these, $|\log x/n|$ has a positive lower bound,
and so their contribution to the sum is²

$$\ll xT^{-1}\sum_{n=1}^{\infty}\Lambda(n)n^{-c}
= xT^{-1}\left[-\frac{\zeta'(c)}{\zeta(c)}\right] \ll xT^{-1}(\log x).$$

¹ It is an interesting exercise to prove the results for $y < 1$ and $y > 1$ by real
variable methods.

Consider next the terms for which $\tfrac{3}{4}x < n < x$. Let $x_1$ be the
largest prime power less than $x$; we can suppose that $\tfrac{3}{4}x < x_1 < x$,
since otherwise the terms under consideration vanish. For the term
$n = x_1$, we have

$$\log\frac{x}{n} = -\log\left(1 - \frac{x - x_1}{x}\right) \ge \frac{x - x_1}{x},$$

and therefore the contribution of this term is

$$\ll \Lambda(x_1)\min\left(1, \frac{x}{T(x - x_1)}\right)
\ll (\log x)\min\left(1, \frac{x}{T(x - x_1)}\right).$$

For the other terms, we can put $n = x_1 - \nu$, where $0 < \nu < \tfrac{1}{4}x$, and
then

$$\log\frac{x}{n} \ge \log\frac{x_1}{n} = -\log\left(1 - \frac{\nu}{x_1}\right) \ge \frac{\nu}{x_1}.$$

Hence the contribution of these terms is

$$\ll \sum_{0 < \nu < \frac{1}{4}x} \Lambda(x_1 - \nu)\,T^{-1}x_1/\nu \ll xT^{-1}(\log x)^{2}.$$

The terms with $x < n < \tfrac{5}{4}x$ are dealt with similarly, except that $x_1$
is replaced by $x_2$, the least prime power greater than $x$.

It is convenient to write $\langle x \rangle$ for the distance from $x$ to the nearest
prime power, other than $x$ itself in case $x$ is a prime power. Collecting
the estimates, we deduce from (3) that

$$|\psi_0(x) - J(x, T)| \ll \frac{x(\log x)^{2}}{T}
+ (\log x)\min\left(1, \frac{x}{T\langle x \rangle}\right). \tag{5}$$

The next step is to replace the vertical line of integration in (4) by
the other three sides of the rectangle with vertices at

$$c - iT, \quad c + iT, \quad -U + iT, \quad -U - iT,$$

where $U$ is a large odd integer. Thus the left vertical side passes
halfway between two of the trivial zeros of $\zeta(s)$. The sum of the
residues of the integrand at its poles inside the rectangle is

$$x - \sum_{|\gamma| < T}\frac{x^{\rho}}{\rho} - \frac{\zeta'(0)}{\zeta(0)}
- \sum_{0 < 2m < U}\frac{x^{-2m}}{-2m}. \tag{6}$$

² From now on, we make use of Vinogradov's symbolism $A \ll B$, as an equivalent
for $A = O(B)$.


The choice of $T$ demands consideration. We saw in a corollary
to the lemma of §15 that, for any large $T$, the number of zeros with
$|\gamma - T| < 1$ is $\ll \log T$. Among the ordinates of these zeros there
must be a gap of length $\gg (\log T)^{-1}$. Hence by varying $T$ by a
bounded amount, we can ensure that

$$|\gamma - T| \gg (\log T)^{-1}$$

for all the zeros $\beta + i\gamma$.
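We recall further the result of §15 that

$$\frac{\zeta'(s)}{\zeta(s)} = \sum_{|\gamma - T| < 1}\frac{1}{s - \rho} + O(\log T)$$

for $s = \sigma + iT$ and $-1 \le \sigma \le 2$. With the present choice of $T$, each
term is $\ll \log T$ and the number of terms is also $\ll \log T$; so that on
the new horizontal lines of integration we have

$$\frac{\zeta'(s)}{\zeta(s)} = O(\log^{2} T) \quad \text{for } -1 \le \sigma \le 2.$$

The contribution made to the horizontal integrals by this range of $\sigma$
is therefore

$$\ll \log^{2} T\int_{-1}^{c}\left|\frac{x^{s}}{s}\right|d\sigma
\ll \frac{\log^{2} T}{T}\int_{-\infty}^{c} x^{\sigma}\,d\sigma
\ll \frac{x\log^{2} T}{T\log x}. \tag{7}$$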


It remains to estimate the contribution made by the horizontal
lines of integration for $-U \le \sigma \le -1$ and by the vertical line
$\sigma = -U$. We need an estimate for $|\zeta'/\zeta|$ for $\sigma \le -1$, and we shall
prove that

$$|\zeta'(s)/\zeta(s)| \ll \log(2|s|) \tag{8}$$

in this half-plane, provided that circles of radius $\tfrac{1}{2}$ (say) around all
the trivial zeros at $s = -2, -4, \ldots$ are excluded. It will follow that
the contribution of the remainder of the horizontal integrals is

$$\ll \frac{\log 2T}{T}\int_{-\infty}^{-1} x^{\sigma}\,d\sigma \ll \frac{\log T}{Tx\log x},$$

which is negligible compared with (7), and the contribution of the
vertical integral is

$$\ll \frac{\log 2U}{U}\int_{-T}^{T} x^{-U}\,dt \ll \frac{T\log U}{Ux^{U}},$$

which vanishes as $U \to \infty$.
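Adding the estimate in (7) to that in (5), and making $U \to \infty$ in (6),
we obtain

$$\psi_0(x) = x - \sum_{|\gamma| < T}\frac{x^{\rho}}{\rho} - \frac{\zeta'(0)}{\zeta(0)}
- \tfrac{1}{2}\log(1 - x^{-2}) + R(x, T), \tag{9}$$

where

$$|R(x, T)| \ll \frac{x\log^{2}(xT)}{T}
+ (\log x)\min\left(1, \frac{x}{T\langle x \rangle}\right). \tag{10}$$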
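As an illustration of (9), the main terms can be evaluated with a short list of zeros and compared with $\psi(x)$ computed directly. This is a rough sketch under stated assumptions: the ordinates below are the first ten from standard tables, rounded to six decimals and hard-coded here (they are not taken from this text), each zero is taken as $\rho = \tfrac12 + i\gamma$, and the value $\zeta'(0)/\zeta(0) = \log 2\pi$ is used. The truncation corresponds to a value of $T$ between the tenth and eleventh ordinates, so only rough agreement should be expected.

```python
import cmath
import math

# Ordinates of the first ten nontrivial zeros of zeta(s), rounded to six decimals.
# Hard-coded from standard tables (an assumption of this sketch).
GAMMAS = [14.134725, 21.022040, 25.010858, 30.424876, 32.935062,
          37.586178, 40.918719, 43.327073, 48.005151, 49.773832]

def psi(x):
    """Chebyshev's psi(x): the sum of Lambda(n) over n <= x, by trial division."""
    total = 0.0
    for p in range(2, int(x) + 1):
        if all(p % d for d in range(2, int(p ** 0.5) + 1)):
            pk = p
            while pk <= x:
                total += math.log(p)
                pk *= p
    return total

def psi_truncated(x, gammas=GAMMAS):
    """Main terms of (9), the sum over zeros being restricted to the listed ordinates."""
    log_x = math.log(x)
    zero_sum = 0.0
    for g in gammas:
        rho = complex(0.5, g)
        # rho and its complex conjugate together contribute 2 * Re(x^rho / rho).
        zero_sum += 2.0 * (cmath.exp(rho * log_x) / rho).real
    return x - zero_sum - math.log(2.0 * math.pi) - 0.5 * math.log(1.0 - x ** -2)

x = 93.0  # an integer that is not a prime power, so psi_0(x) = psi(x)
exact = psi(x)
approx = psi_truncated(x)
print(exact, approx)
```

Adding further zeros (that is, increasing $T$) is what the error term (10) quantifies.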


The estimate (8) is deduced from the functional equation, which is
best taken in its unsymmetric form [(4) of §10]

$$\zeta(1 - s) = 2^{1-s}\pi^{-s}(\cos\tfrac{1}{2}\pi s)\,\Gamma(s)\zeta(s),$$

since, if $1 - \sigma \le -1$ the functions on the right have to be considered
only for $\sigma \ge 2$. The logarithmic derivative of the right side, apart
from an added constant, is

$$-\tfrac{1}{2}\pi\tan\tfrac{1}{2}\pi s + \frac{\Gamma'(s)}{\Gamma(s)} + \frac{\zeta'(s)}{\zeta(s)}.$$

The first term is bounded if $|s - (2m + 1)| \ge \tfrac{1}{2}$, that is, if

$$|(1 - s) + 2m| \ge \tfrac{1}{2}.$$

The second term is $\ll \log|s|$, and therefore $\ll \log 2|1 - s|$ for $\sigma \ge 2$.
The third term is bounded. Hence (8) follows.

The results (9) and (10) constitute the more precise form of
the explicit formula (1). As $T \to \infty$ for any given $x \ge 2$, we have
$R(x, T) \to 0$, and therefore (1) follows. The convergence is uniform
in any closed interval of $x$ which does not contain a prime power,
but not otherwise, since $\psi(x)$ is discontinuous at each prime power
value of $x$.

We proved (9) and (10) subject to a restriction on $T$, but this can
now be removed. The effect of varying $T$ by a bounded amount is to
change the sum over $\rho$ by $O(\log T)$ terms, and each term is $O(x/T)$.
Hence the variation in the sum is $O[x(\log T)/T]$, and this is covered
by the estimate on the right of (10).

We note for future reference that, if $x$ is an integer, then $\langle x \rangle \ge 1$,
and (10) takes the simpler form

$$|R(x, T)| \ll x(\log xT)^{2}T^{-1}. \tag{11}$$

The results (9) and (10) continue to hold³ for $1 < x < 2$, with a
slight modification in the form of the estimate for $R(x, T)$.

³ For this, and for a discussion of the series $\sum x^{\rho}/\rho$ when $0 < x < 1$, see Ingham, p. 81.

THE PRIME NUMBER THEOREM


We shall now deduce, from the results of the last section and those
of §13, that

$$\psi(x) = x + O\{x\exp[-c(\log x)^{1/2}]\}, \tag{1}$$

and from this the analogous result for $\pi(x)$, which includes the prime
number theorem. This is by no means the easiest way of proving
the prime number theorem, but it is an instructive way. It is also very
close to the method used by de la Vallée Poussin, though he worked
with the function

$$\psi_1(x) = \sum_{n \le x}(x - n)\Lambda(n)$$

instead of the function $\psi(x)$.

The main question is that of estimating the sum $\sum x^{\rho}/\rho$ in (9) of the
preceding section, and obviously any effective estimate must be
deduced from the fact that the real part $\beta$ of $\rho$ is not too near 1. It
follows from the result of §13 that if $|\gamma| < T$, where $T$ is large, then
$\beta < 1 - c_1/\log T$, where $c_1$ is a positive absolute constant. Hence

$$|x^{\rho}| = x^{\beta} < x\exp[-c_1(\log x)/(\log T)].$$

Also $|\rho| > \gamma$, for $\gamma > 0$; so it remains to estimate

$$\sum_{0 < \gamma < T} 1/\gamma.$$

If $N(t)$ denotes, as in §15, the number of zeros in the critical strip
with ordinates less than $t$, this sum is

$$\int_{0}^{T} t^{-1}\,dN(t) = T^{-1}N(T) + \int_{0}^{T} t^{-2}N(t)\,dt.$$

Since $N(t) \ll t\log t$ for large $t$, this is $\ll \log^{2} T$. Hence

$$\sum_{|\gamma| < T}\left|\frac{x^{\rho}}{\rho}\right| \ll x(\log T)^{2}\exp[-c_1(\log x)/(\log T)].$$

We can take $x$ to be an integer, without loss of generality. It
follows from (9) and (11) of §17 that

$$|\psi(x) - x| \ll \frac{x\log^{2}(xT)}{T} + x(\log T)^{2}\exp[-c_1(\log x)/(\log T)],$$

for large $x$. If we determine $T$ as a function of $x$ by

$$(\log T)^{2} = \log x,$$

so that

$$T^{-1} = \exp[-(\log x)^{1/2}],$$

we get

$$|\psi(x) - x| \ll x(\log x)^{2}\exp[-(\log x)^{1/2}] + x(\log x)\exp[-c_1(\log x)^{1/2}]
\ll x\exp[-c_2(\log x)^{1/2}],$$

provided $c_2$ is a constant that is less than both 1 and $c_1$. This proves
(1).

The transition to an asymptotic formula for $\pi(x)$, instead of for
$\psi(x)$, is elementary and is essentially an exercise in partial summation.
First we pass to the function

$$\pi_1(x) = \sum_{n \le x}\frac{\Lambda(n)}{\log n}.$$

This is expressed in terms of the function $\psi(x)$ by

$$\pi_1(x) = \sum_{n \le x}\Lambda(n)\int_{n}^{x}\frac{dt}{t\log^{2} t}
+ \frac{1}{\log x}\sum_{n \le x}\Lambda(n)
= \int_{2}^{x}\frac{\psi(t)\,dt}{t\log^{2} t} + \frac{\psi(x)}{\log x}.$$

The effect of replacing $\psi(t)$ by $t$ is to give

$$\int_{2}^{x}\frac{dt}{\log^{2} t} + \frac{x}{\log x} = \operatorname{li} x + \frac{2}{\log 2},$$

on integrating by parts. Thus it remains to consider the estimate for
the error term, which is

$$\ll \int_{2}^{x}\exp[-c_2(\log t)^{1/2}]\,dt + x\exp[-c_2(\log x)^{1/2}].$$

The contribution of the range $t < x^{1/4}$ to the integral is trivially less
than $x^{1/4}$, and in the rest of the range we have $(\log t)^{1/2} > \tfrac{1}{2}(\log x)^{1/2}$.
Hence

$$\pi_1(x) = \operatorname{li} x + O\{x\exp[-c_3(\log x)^{1/2}]\},$$

where $c_3 = \tfrac{1}{2}c_2$.


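Finally, since

$$\pi_1(x) = \pi(x) + \tfrac{1}{2}\pi(x^{1/2}) + \tfrac{1}{3}\pi(x^{1/3}) + \cdots = \pi(x) + O(x^{1/2}),$$

we obtain

$$\pi(x) = \operatorname{li} x + O\{x\exp[-c_3(\log x)^{1/2}]\}.$$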
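The passage from $\pi_1(x)$ to $\pi(x)$, and the quality of the approximation by $\operatorname{li} x$, can be seen in a small computation. The sketch below is an illustration only; the helper names and the value $x = 10^5$ are assumptions, and $\operatorname{li} x$ is taken as the integral from 2 to $x$ of $dt/\log t$, which is the normalization consistent with the integration by parts above.

```python
import math

def primes_up_to(n):
    """Sieve of Eratosthenes."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, int(n ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, n + 1, p)))
    return [p for p in range(2, n + 1) if sieve[p]]

def li(x, steps=200000):
    """Integral from 2 to x of dt / log t, by Simpson's rule (steps must be even)."""
    h = (x - 2.0) / steps
    total = 1.0 / math.log(2.0) + 1.0 / math.log(x)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) / math.log(2.0 + k * h)
    return total * h / 3.0

x = 100000
primes = primes_up_to(x)
pi_x = len(primes)

# pi_1(x) is the sum of Lambda(n)/log n: a prime power p^k contributes 1/k.
pi_1 = 0.0
for p in primes:
    k, pk = 1, p
    while pk <= x:
        pi_1 += 1.0 / k
        k, pk = k + 1, pk * p

li_x = li(float(x))
print(pi_x, pi_1, li_x)
```

The difference $\pi_1(x) - \pi(x)$ comes entirely from prime powers $p^k$ with $k \ge 2$, which is why it is $O(x^{1/2})$.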


This is the form of the prime number theorem proved by de la
Vallée Poussin in 1899. It was improved to

$$\pi(x) = \operatorname{li} x + O\{x\exp[-c(\theta)(\log x)^{\theta}]\}$$

for any $\theta < \tfrac{3}{5}$, by Vinogradov and Korobov in 1958. The improvement
comes from the result on a zero-free region for $\zeta(s)$, mentioned
at the end of §13. One uses this in the explicit formula for $\psi(x)$, and
chooses $T$ so that $(\log T)^{1+\alpha} = \log x$, where $\alpha$ is any number greater
than $\tfrac{2}{3}$.

The assumption of the Riemann hypothesis implies a much better
estimate for the error term, as was pointed out by von Koch in 1901.
We then have $|x^{\rho}| = x^{1/2}$, and as we proved earlier that $\sum 1/|\rho|$ over
$0 < \gamma < T$ is $O(\log^{2} T)$, the explicit formula gives

$$|\psi(x) - x| \ll x^{1/2}\log^{2} T + xT^{-1}\log^{2} xT,$$

if $x$ is an integer. Choosing $T = x^{1/2}$, we get

$$\psi(x) = x + O(x^{1/2}\log^{2} x).$$

From this it follows, by the same argument as above, that

$$\pi(x) = \operatorname{li} x + O(x^{1/2}\log x).$$

The situation is exactly similar, with $x^{\Theta}$ in place of $x^{1/2}$, if one assumes
only that all the zeros have $\beta \le \Theta$, where $\Theta$ is a number between
$\tfrac{1}{2}$ and 1.

There is also an implication in the opposite sense, by an argument
which is quite elementary. If we assume that

$$\psi(x) = x + O(x^{\alpha})$$

for some fixed $\alpha < 1$, it follows that all the zeros of $\zeta(s)$ have $\beta \le \alpha$.
For if $\sigma > 1$, we have

$$-\frac{\zeta'(s)}{\zeta(s)} = \sum_{n=1}^{\infty}\Lambda(n)n^{-s},$$

and this is easily rearranged in the form

$$-\frac{\zeta'(s)}{\zeta(s)} = s\int_{1}^{\infty}\psi(x)x^{-s-1}\,dx,$$

as on similar occasions earlier. If $\psi(x) = x + R(x)$, we get

$$-\frac{\zeta'(s)}{\zeta(s)} = \frac{s}{s - 1} + s\int_{1}^{\infty}R(x)x^{-s-1}\,dx.$$

The supposition that $R(x) = O(x^{\alpha})$ implies that the integral
represents a regular function of $s$ for $\sigma > \alpha$, and then $\zeta(s)$ can have no
zeros in this half-plane.
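The rearrangement used in this argument is a routine partial summation. As a sketch of the omitted step, for $\sigma > 1$ (where everything converges absolutely, so that the sum and the integral may be interchanged):

```latex
\begin{align*}
\sum_{n=1}^{\infty}\Lambda(n)n^{-s}
  &= \sum_{n=1}^{\infty}\Lambda(n)\,s\int_{n}^{\infty}x^{-s-1}\,dx
   = s\int_{1}^{\infty}\Bigl(\sum_{n\le x}\Lambda(n)\Bigr)x^{-s-1}\,dx
   = s\int_{1}^{\infty}\psi(x)x^{-s-1}\,dx, \\
s\int_{1}^{\infty}x\cdot x^{-s-1}\,dx
  &= s\int_{1}^{\infty}x^{-s}\,dx = \frac{s}{s-1}.
\end{align*}
```

The second line accounts for the term $s/(s-1)$ that appears when $\psi(x)$ is replaced by $x + R(x)$.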

There is the curious conclusion, from the last two results, that if
$\psi(x) = x + O(x^{\Theta+\varepsilon})$ for each $\varepsilon > 0$, where $\Theta$ is a fixed number
between $\tfrac{1}{2}$ and 1, then necessarily

$$\psi(x) = x + O(x^{\Theta}\log^{2} x).$$

Grosswald¹ has shown that if $\Theta$ is strictly larger than $\tfrac{1}{2}$ then the
factor $\log^{2} x$ can be deleted.

¹ C. R. Acad. Sci., Paris, 260, 3813-3816 (1965).

THE EXPLICIT FORMULA FOR ψ(x, χ)


For any character $\chi$ to the modulus $q$, we define

$$\psi(x, \chi) = \sum_{n \le x}\chi(n)\Lambda(n). \tag{1}$$

These sums play much the same part in the prime number theorem
for arithmetic progressions as that played by $\psi(x)$ in the prime
number theorem itself, but now there is an aggregate of $\phi(q)$ such
sums, one for each character, instead of a single sum. In this section
we shall establish the explicit formula that is analogous to that
proved for $\psi(x)$ in §17. As there, we modify $\psi(x, \chi)$ to $\psi_0(x, \chi)$ in
case $x$ is a prime power.

The general lines of the argument are the same as in §17, but with
$L'/L$ in place of $\zeta'/\zeta$. Suppose $\chi$ is a primitive character (mod $q$).
Consider first the computation of the residues of

$$\left[-\frac{L'(s, \chi)}{L(s, \chi)}\right]\frac{x^{s}}{s}.$$

The only poles are at the zeros of $L(s, \chi)$ and at $s = 0$; and there is
a slight complication in that, if $\chi(-1) = 1$, one of the zeros of $L(s, \chi)$
is itself at $s = 0$, so that the function has a double pole there.

Suppose first that $\chi(-1) = -1$. Then the complication just mentioned
does not arise, and the explicit formula is

$$\psi_0(x, \chi) = -\sum_{\rho}\frac{x^{\rho}}{\rho} - \frac{L'(0, \chi)}{L(0, \chi)}
+ \sum_{m=1}^{\infty}\frac{x^{1-2m}}{2m - 1}, \tag{2}$$

the expression on the right being the sum of the residues of the
function mentioned above. There is the same understanding about
the sum over the nontrivial zeros $\rho$ of $L(s, \chi)$ as in §17. The value of
$L'(0, \chi)/L(0, \chi)$ can be expressed in terms of the constant $B(\chi)$ of
§12 by putting $s = 0$ in (17) of §12.

Suppose next that $\chi(-1) = 1$. Since $L(s, \chi)$ has a simple zero at
$s = 0$, the expansion near $s = 0$ of $L'/L$ is of the form

$$\frac{L'(s, \chi)}{L(s, \chi)} = \frac{1}{s} + b + \cdots,$$

where $b = b(\chi)$. Since

$$\frac{x^{s}}{s} = \frac{1}{s} + (\log x) + \cdots,$$

the residue of the function mentioned above at $s = 0$ is $-(\log x + b)$.
The explicit formula therefore takes the form

$$\psi_0(x, \chi) = -\sum_{\rho}\frac{x^{\rho}}{\rho} - \log x - b(\chi)
+ \sum_{m=1}^{\infty}\frac{x^{-2m}}{2m}. \tag{3}$$

Once again, $b(\chi)$ can be expressed in terms of $B(\chi)$ by using (17) of
§12.

We now outline the proof, and the estimation of the error term
when the sum is taken over $|\gamma| < T$. We suppose that $x \ge 2$ and
$T \ge 2$. The character $\chi(n)$ plays no part in the estimation of the
sum on the right of (3) of §17, and therefore the inequality analogous
to (5) of §17 is still valid. In the choice of a modified value of $T$,
we have to appeal to the lemma of §16 instead of that of §15, and
accordingly we get

$$\frac{L'(\sigma + iT, \chi)}{L(\sigma + iT, \chi)} = O(\log^{2} qT) \quad \text{for } -1 \le \sigma \le 2.$$

The contribution made to the horizontal integrals by this range of $\sigma$
is therefore

$$\ll \frac{x\log^{2} qT}{T\log x}.$$

The estimate for $L'/L$ in the half-plane $\sigma \le -1$, when circles of
radius $\tfrac{1}{2}$ around the trivial zeros are excluded, is

$$\frac{L'(s, \chi)}{L(s, \chi)} = O[\log(q|s|)].$$

This follows by logarithmic differentiation from the functional
equation of $L(s, \chi)$ in its unsymmetric form, namely,

$$L(1 - s, \chi) = \varepsilon(\chi)\,2^{1-s}\pi^{-s}q^{s-\frac{1}{2}}\cos\tfrac{1}{2}\pi(s - a)\,\Gamma(s)L(s, \bar{\chi}),$$

where $|\varepsilon(\chi)| = 1$ and $a = 0$ or 1. This form is deduced from the
symmetric form in the same way as (4) of §10. The contribution of
the rest of the horizontal integrals is therefore

$$\ll \frac{\log qT}{Tx\log x},$$

and as before it is negligible.

The result is that

$$\psi_0(x, \chi) = -\sum_{|\gamma| < T}\frac{x^{\rho}}{\rho} - (1 - a)\log x - b(\chi)
+ \sum_{m=1}^{\infty}\frac{x^{a-2m}}{2m - a} + R(x, T), \tag{4}$$

where

$$|R(x, T)| \ll \frac{x}{T}\log^{2}(qxT) + (\log x)\min\left(1, \frac{x}{T\langle x \rangle}\right). \tag{5}$$

Again, if $x$ is fixed, this tends to 0 as $T \to \infty$, and so we have the
results given in (2) for the case $a = 1$ and in (3) for the case $a = 0$.
If $x$ is an integer, we can replace (5) by

$$|R(x, T)| \ll xT^{-1}\log^{2}(qxT). \tag{6}$$


From the point of view of application to the distribution of primes
in arithmetic progressions with a variable modulus, formula
(4) is of little use as it stands. It contains the unknown $b(\chi)$, and it
contains terms $x^{\rho}/\rho$ for which $\rho$ may be very near either 1 or 0.
It will be recalled that the results of §14 state that there is at most
one zero within a distance $c/\log q$ of $s = 1$ (and so also of $s = 0$),
and this one zero can only occur when $\chi$ is a real character and is
itself real. It is important to have this zero visible explicitly in the
formula.

We need no longer distinguish between $\psi$ and $\psi_0$, as we are not
aiming at exactitude, and we can simplify (4) by absorbing $\log x$
and the sum over $m$ into the error term. We can use the form (6)
of the error term, since the effect on $\psi_0(x, \chi)$ of replacing $x$ by the
nearest integer is $O(\log x)$; and for simplicity we suppose now that
$T \le x$. Then

$$\psi(x, \chi) = -\sum_{|\gamma| < T}\frac{x^{\rho}}{\rho} - b(\chi) + R_1(x, T), \tag{7}$$

$$|R_1(x, T)| \ll xT^{-1}\log^{2} qx. \tag{8}$$

The first step is to express $b(\chi)$ in another form. In (17) of §12,

$$\frac{L'(s, \chi)}{L(s, \chi)} = -\frac{1}{2}\log\frac{q}{\pi}
- \frac{1}{2}\frac{\Gamma'(\tfrac{1}{2}s + \tfrac{1}{2}a)}{\Gamma(\tfrac{1}{2}s + \tfrac{1}{2}a)}
+ B(\chi) + \sum_{\rho}\left(\frac{1}{s - \rho} + \frac{1}{\rho}\right).$$

Replacing $s$ by 2 and subtracting, we obtain

$$\frac{L'(s, \chi)}{L(s, \chi)} = O(1)
- \frac{1}{2}\frac{\Gamma'(\tfrac{1}{2}s + \tfrac{1}{2}a)}{\Gamma(\tfrac{1}{2}s + \tfrac{1}{2}a)}
+ \sum_{\rho}\left(\frac{1}{s - \rho} - \frac{1}{2 - \rho}\right),$$

where the $O(1)$ is absolute. If $\chi(-1) = -1$, so that $a = 1$, the term
$\Gamma'/\Gamma$ is regular at $s = 0$; if $\chi(-1) = 1$, so that $a = 0$, its expansion
near $s = 0$ is $s^{-1} + \text{const.} + \cdots$. Hence the number $b(\chi)$, which we
defined earlier as the value of $L'(0, \chi)/L(0, \chi)$ in the former case, and
as the constant term in the expansion of $L'(s, \chi)/L(s, \chi)$ near $s = 0$
in the latter case, satisfies

$$b(\chi) = O(1) - \sum_{\rho}\left(\frac{1}{\rho} + \frac{1}{2 - \rho}\right).$$

For the terms in this series with $|\gamma| \ge 1$, we have

$$\sum_{|\gamma| \ge 1}\left|\frac{1}{\rho} + \frac{1}{2 - \rho}\right|
= \sum_{|\gamma| \ge 1}\frac{2}{|\rho(2 - \rho)|} \ll \sum_{\rho}\frac{1}{|2 - \rho|^{2}}.$$

The last sum can be estimated as $O(\log q)$ using (3) of §16 with $t = 0$.
The same estimate applies to

$$\sum_{|\gamma| < 1}\frac{1}{2 - \rho},$$

since for $|\gamma| < 1$ we have $|2 - \rho|^{-1} \ll |2 - \rho|^{-2}$. It follows that

$$b(\chi) = O(\log q) - \sum_{|\gamma| < 1}\frac{1}{\rho}.$$

We can therefore rewrite (7) as

$$\psi(x, \chi) = -\sum_{|\gamma| < T}\frac{x^{\rho}}{\rho} + \sum_{|\gamma| < 1}\frac{1}{\rho} + R_2(x, T), \tag{9}$$

where $R_2(x, T)$ satisfies (8).
By the theorem of §14, there is no zero of $L(s, \chi)$ satisfying

$$|\gamma| < 1, \quad \beta > 1 - c/\log q, \tag{10}$$

except possibly when $\chi$ is real, when there may (as far as we know)
be one simple real zero. Here $c$ is a numerical constant, which we
can suppose less than $\tfrac{1}{4}$, whence $\beta > \tfrac{3}{4}$, since $q \ge 3$. We call such
a real zero exceptional and denote it by $\beta_1$. There will also be a
zero at $1 - \beta_1$.

Let $\sum'$ denote a summation over the zeros which excludes the
possible zeros $\beta_1$ and $1 - \beta_1$. Then we can rewrite (9) as

$$\psi(x, \chi) = -\sum_{|\gamma| < T}{}^{'}\,\frac{x^{\rho}}{\rho}
+ \sum_{|\gamma| < 1}{}^{'}\,\frac{1}{\rho}
- \frac{x^{\beta_1} - 1}{\beta_1} - \frac{x^{1-\beta_1} - 1}{1 - \beta_1} + R_2(x, T).$$

The second sum can be absorbed in the error term, since

$$\rho^{-1} = O(\log q)$$

for the zeros in question, and their number is $O(\log q)$ by (1) of §16
with $T = 2$. We can also omit the term $\beta_1^{-1}$, which is $O(1)$. Finally,

$$\frac{x^{1-\beta_1} - 1}{1 - \beta_1} = x^{\sigma}\log x$$

for some $\sigma$ between 0 and $1 - \beta_1$, and the last expression is less than
$x^{1/4}\log x$.
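We now have the convenient expression (valid for primitive $\chi$
and $2 \le T \le x$)

$$\psi(x, \chi) = -\frac{x^{\beta_1}}{\beta_1} - \sum_{|\gamma| < T}{}^{'}\,\frac{x^{\rho}}{\rho} + R_3(x, T), \tag{11}$$

where

$$|R_3(x, T)| \ll xT^{-1}\log^{2}(qx) + x^{1/4}\log x. \tag{12}$$

The term $-x^{\beta_1}/\beta_1$ in (11) can only occur if $\chi$ is real.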

Finally, we prove that (11) and (12) hold for any nonprincipal
character $\chi$, whether primitive or not. Suppose $\chi$ is imprimitive
and is induced by the primitive character $\chi_1$ (mod $q_1$). The difference
between $\psi(x, \chi)$ and $\psi(x, \chi_1)$ is at most

$$\sum_{\substack{n \le x\\ (n, q) > 1}}\Lambda(n) = \sum_{p \mid q}\sum_{p^{m} \le x}\log p
\ll (\log x)\sum_{p \mid q}\log p \le (\log x)(\log q),$$

which is negligible compared with the expression in (12), where
$T \le x$. This expression applies to the error term in the formula for
$\psi(x, \chi_1)$ because $q \ge q_1$.
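The size of this difference is easy to examine numerically. The sketch below, with illustrative helper names and sample values of $x$ and $q$ that are assumptions of the example, sums $\Lambda(n)$ over the $n \le x$ that share a factor with $q$ and compares the result with $(\log x)(\log q)$.

```python
import math

def prime_divisors(q):
    """Distinct prime divisors of q, by trial division."""
    out, p = [], 2
    while p * p <= q:
        if q % p == 0:
            out.append(p)
            while q % p == 0:
                q //= p
        p += 1
    if q > 1:
        out.append(q)
    return out

def non_coprime_mangoldt_sum(x, q):
    """Sum of Lambda(n) over n <= x with (n, q) > 1, that is, over powers of primes p | q."""
    total = 0.0
    for p in prime_divisors(q):
        pk = p
        while pk <= x:
            total += math.log(p)
            pk *= p
    return total

x, q = 10**6, 2 * 3 * 5 * 7 * 11 * 13
lhs = non_coprime_mangoldt_sum(x, q)
rhs = math.log(x) * math.log(q)
omega = len(prime_divisors(q))
print(lhs, rhs)
```

Each prime $p \mid q$ contributes at most $\log x$, so the sum is at most $\log x$ times the number of prime divisors of $q$, which is a slightly sharper statement than the one needed in the text.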

There is, however, a logical point that needs consideration. The
definition of the exceptional zero $\beta_1$ is a definition that involves
the modulus, and, assuming we use the same definition when $\chi$
is imprimitive, an exceptional zero for $L(s, \chi)$ will certainly be an
exceptional zero for $L(s, \chi_1)$, but not necessarily vice versa. However,
if a zero is exceptional for $\chi_1$ but not for $\chi$, the term $-x^{\beta_1}/\beta_1$, which
is explicit in the formula for $\psi(x, \chi_1)$, will still be present in the
formula for $\psi(x, \chi)$, since it will occur there in the sum $-\sum' x^{\rho}/\rho$.

Thus the formula remains valid, and we restate it for convenience
of reference:

If $\chi$ is a nonprincipal character to the modulus $q$, and $2 \le T \le x$,
then

$$\psi(x, \chi) = -\frac{x^{\beta_1}}{\beta_1} - \sum_{|\gamma| < T}{}^{'}\,\frac{x^{\rho}}{\rho} + R_3(x, T), \tag{13}$$

where

$$|R_3(x, T)| \ll xT^{-1}\log^{2} qx + x^{1/4}\log x. \tag{14}$$

The term $-x^{\beta_1}/\beta_1$ is to be omitted unless $\chi$ is a real character for
which $L(s, \chi)$ has a zero $\beta_1$ (which is necessarily unique and simple)
satisfying

$$\beta_1 > 1 - c/\log q, \tag{15}$$

where $c$ is a certain absolute constant; and the sum $\sum'$ excludes $\beta_1$
and $1 - \beta_1$ (if they exist).

As we saw in §14 there is at most one real character (mod $q$) for
which such a zero $\beta_1$ can exist. It may be noted that the term
$x^{1/4}\log x$ in (14) can be omitted unless $\beta_1$ exists.

THE PRIME NUMBER THEOREM FOR ARITHMETIC PROGRESSIONS (I)


We now apply the last result of the preceding section to obtain
approximations to

$$\psi(x; q, a) = \sum_{\substack{n \le x\\ n \equiv a \,(\mathrm{mod}\, q)}}\Lambda(n). \tag{1}$$

From this it is an elementary matter to deduce approximations
to $\pi(x; q, a)$, the number of primes up to $x$ that are congruent to
$a$ (mod $q$).

The relationship between $\psi(x; q, a)$ and the sums $\psi(x, \chi)$ of the
preceding section follows immediately from (4) of §4; we have

$$\psi(x; q, a) = \frac{1}{\phi(q)}\sum_{\chi}\bar{\chi}(a)\psi(x, \chi), \tag{2}$$

where the sum is over all the characters $\chi$ to the modulus $q$.
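Relation (2) can be verified directly for a small modulus. The following sketch takes $q = 5$ and builds the four characters from the primitive root 2; the index table, the function names and the choice $x = 2000$ are assumptions made for the example.

```python
import math

Q = 5
# 2 is a primitive root mod 5, so every unit is a power 2^j; INDEX[n % 5] = j.
INDEX = {1: 0, 2: 1, 4: 2, 3: 3}

def chi(k, n):
    """The k-th character mod 5 (k = 0 is principal): chi_k(2^j) = i^(j*k)."""
    r = n % Q
    return 0j if r == 0 else 1j ** ((INDEX[r] * k) % 4)

def mangoldt(n):
    """Lambda(n) = log p if n is a power of the prime p, and 0 otherwise."""
    for p in range(2, int(n ** 0.5) + 1):
        if n % p == 0:
            while n % p == 0:
                n //= p
            return math.log(p) if n == 1 else 0.0
    return math.log(n) if n > 1 else 0.0

def psi_chi(x, k):
    """psi(x, chi_k): the sum of chi_k(n) Lambda(n) over n <= x."""
    return sum(chi(k, n) * mangoldt(n) for n in range(1, x + 1))

def psi_progression(x, a):
    """psi(x; q, a): the sum of Lambda(n) over n <= x with n = a (mod q)."""
    return sum(mangoldt(n) for n in range(1, x + 1) if n % Q == a % Q)

x = 2000
phi_q = 4
direct = {a: psi_progression(x, a) for a in range(1, Q)}
via_characters = {
    a: sum(chi(k, a).conjugate() * psi_chi(x, k) for k in range(4)) / phi_q
    for a in range(1, Q)
}
for a in range(1, Q):
    print(a, direct[a], via_characters[a])
```

The two computations should agree up to rounding, and the imaginary parts of the character sums should cancel.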

The contribution of the principal character $\chi_0$ to the sum on
the right provides the main term. By an argument similar to one
used in the preceding section, we have

$$|\psi(x, \chi_0) - \psi(x)| \le \sum_{\substack{n \le x\\ (n, q) > 1}}\Lambda(n) \ll (\log q)(\log x).$$

By de la Vallée Poussin's form of the prime number theorem,
namely (1) of §18,

$$\psi(x) = x + O\{x\exp[-c_1(\log x)^{1/2}]\},$$

where $c_1$ is a positive constant. Hence

$$\psi(x; q, a) = \frac{x}{\phi(q)} + \frac{1}{\phi(q)}\sum_{\chi \ne \chi_0}\bar{\chi}(a)\psi(x, \chi)
+ O\left\{\frac{1}{\phi(q)}\left\{x\exp[-c_1(\log x)^{1/2}] + \log^{2} qx\right\}\right\}. \tag{3}$$

For $\psi(x, \chi)$ when $\chi \ne \chi_0$, we have the expression (13) of the last
section, namely

$$\psi(x, \chi) = -\frac{x^{\beta_1}}{\beta_1} - \sum_{|\gamma| < T}{}^{'}\,\frac{x^{\rho}}{\rho} + R_3(x, T), \tag{4}$$

where

$$|R_3(x, T)| \ll xT^{-1}\log^{2} qx + x^{1/4}\log x$$

provided $2 \le T \le x$. The term in (4) containing $\beta_1$ occurs for at
most one real nonprincipal $\chi$.

By the results of §14, and the remarks at the end of §19, all the
zeros $\rho$ in the sum on the right of (4) satisfy

$$\beta < 1 - c_2/\log qT$$

for a certain positive absolute constant $c_2$. Hence

$$|x^{\rho}| = x^{\beta} < x\exp[-c_2(\log x)/(\log qT)].$$

The sum $\sum|\rho|^{-1}$ extended over the zeros in (4) with $|\gamma| \ge 1$ can be
estimated as in §18, and is

$$\ll \int_{1}^{T} t^{-2}N(t, \chi)\,dt \ll \int_{1}^{T} t^{-1}\log(qt)\,dt \ll \log^{2} qT \ll \log^{2} qx.$$

The same sum over the zeros $\rho$ with $|\gamma| < 1$ is $O(\log^{2} q)$, since
$|\rho|^{-1} = O(\log q)$ for each of them. Hence

$$\psi(x, \chi) = -\frac{x^{\beta_1}}{\beta_1} + R_4(x, T), \tag{5}$$

where

$$|R_4(x, T)| \ll x(\log^{2} qx)\exp[-c_2(\log x)/(\log qT)]
+ xT^{-1}\log^{2} qx + x^{1/4}\log x. \tag{6}$$


Some condition must be imposed on the size of $q$ in relation to
that of $x$. If we suppose that

$$q \le \exp[C(\log x)^{1/2}], \tag{7}$$

where $C$ is any positive constant, and choose

$$T = \exp[C(\log x)^{1/2}],$$

then all the terms on the right of (6) are

$$\ll x\exp[-C'(\log x)^{1/2}]$$

for some positive $C'$ depending only on $C$. Hence, subject to (7),
we have

$$\psi(x, \chi) = -x^{\beta_1}/\beta_1 + O\{x\exp[-C'(\log x)^{1/2}]\} \tag{8}$$

for each nonprincipal $\chi$ to the modulus $q$.
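The claim about the three terms of (6) can be checked term by term. As a sketch, assume $x \ge e$, so that (7) gives $\log qx \le (1 + C)\log x$ and $\log qT \le 2C(\log x)^{1/2}$; then

```latex
\begin{align*}
x(\log^{2} qx)\exp\!\left[-\frac{c_2\log x}{\log qT}\right]
  &\le (1+C)^{2}\,x(\log x)^{2}\exp\!\left[-\frac{c_2}{2C}(\log x)^{1/2}\right],\\
xT^{-1}\log^{2} qx
  &\le (1+C)^{2}\,x(\log x)^{2}\exp\!\left[-C(\log x)^{1/2}\right],\\
x^{1/4}\log x
  &\ll x\exp\!\left[-C'(\log x)^{1/2}\right]\quad\text{for every fixed } C' > 0.
\end{align*}
```

Since $(\log x)^{2}$ is absorbed by lowering the constant in the exponent, any $C'$ smaller than both $C$ and $c_2/(2C)$ will serve, and such a $C'$ depends only on $C$ because $c_2$ is absolute.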
Substituting in (3), and recalling that the term containing $\beta_1$
occurs for at most one $\chi$, we get the following result for $\psi(x; q, a)$:

Let $C$ be any positive constant. Then

$$\psi(x; q, a) = \frac{x}{\phi(q)} - \frac{\bar{\chi}_1(a)x^{\beta_1}}{\phi(q)\beta_1}
+ O\{x\exp[-C'(\log x)^{1/2}]\} \tag{9}$$

for a positive constant $C'$ depending only on $C$, and this holds uniformly
with respect to $q$ in the range (7). Here $\chi_1$ is the single real
character (mod $q$), if it exists, for which $L(s, \chi_1)$ has a real zero $\beta_1$
satisfying $\beta_1 > 1 - c/\log q$ for a certain positive absolute constant $c$.

It is in the possible term containing $\beta_1$ that one of the main difficulties
in the theory of the distribution of primes in arithmetic progressions
shows itself. The only universal upper bound that we
have for $\beta_1$ is (12) of §14, which states that $\beta_1 < 1 - c_3/(q^{1/2}\log^{2} q)$.
The term containing $\beta_1$ is therefore

$$\ll \frac{x}{\phi(q)}\exp\left[-c_3\frac{\log x}{q^{1/2}\log^{2} q}\right].$$

This will only be of the same order as the other error term in (9) if
we impose a very severe limitation on $q$, such as

$$q \le (\log x)^{1-\delta} \tag{10}$$

for some fixed $\delta > 0$. We then obtain the following result.

Provided $q$ satisfies (10) for some fixed $\delta > 0$, we have

$$\psi(x; q, a) = \frac{x}{\phi(q)} + O\{x\exp[-c_4(\log x)^{1/2}]\}, \tag{11}$$

where $c_4$ is an absolute constant.

This is a weak result, but as far as I know it is as yet the only result
of the kind (apart from minor variations) that is effective, in the
sense that, if $\delta$ is given a numerical value, both $c_4$ and the constant
implied by the symbol $O$ can be given numerical values.

As Page showed, we can obtain a similar result in the wider range
(7), provided $q$ does not coincide with a multiple of a particular
integer $q_1$, depending on $x$. In his result, given in §14, we take

$$z = \exp[C(\log x)^{1/2}].$$

Then the result tells us that there is at most one real primitive character
to a modulus not exceeding $z$ for which

$$\beta_1 > 1 - \frac{c_5}{\log z} = 1 - \frac{c_5}{C(\log x)^{1/2}}.$$

Denote the modulus of this character (if it exists) by $q_1$. Then if $q$ is
not a multiple of $q_1$, we have

$$\beta_1 < 1 - \frac{c_5}{C(\log x)^{1/2}}$$

for every real nonprincipal $\chi$ (mod $q$), whether primitive or not.
We then obtain the same type of estimate for the term containing
$\beta_1$ as before. Note that, if $q_1$ exists, it must satisfy

$$1 - \frac{c_5}{C(\log x)^{1/2}} < \beta_1 < 1 - \frac{c_3}{q_1^{1/2}\log^{2} q_1},$$

that is,

$$q_1\log^{4} q_1 \gg \log x. \tag{12}$$

Hence we have proved:

Let $C$ be any constant. Then, except possibly if $q$ is a multiple of a
particular integer $q_1$ depending on $x$, we have

$$\psi(x; q, a) = \frac{x}{\phi(q)} + O\{x\exp[-C'(\log x)^{1/2}]\} \tag{13}$$

for a positive constant $C'$ depending only on $C$, and this holds uniformly
with respect to $q$ in the range (7). The integer $q_1$ satisfies (12).

In the next section we shall prove Siegel's theorem, which gives a
much better upper bound for $\beta_1$ than we have had hitherto, and then
in §22 we shall return to the question of the distribution of primes in
arithmetic progressions.

We conclude this section by stating the consequences of the
generalized Riemann hypothesis, that is, the hypothesis that not
only $\zeta(s)$ but all the functions $L(s, \chi)$ have their zeros in the critical
strip on the line $\sigma = \tfrac{1}{2}$. (This conjecture seems to have been first
formulated by Piltz in 1884.) Then

$$\psi(x) = x + O(x^{1/2}\log^{2} x),$$

as we saw in §18, and the same holds for $\psi(x, \chi_0)$, by the inequality
for $|\psi(x, \chi_0) - \psi(x)|$ at the beginning of this section, provided we
suppose (say) that $q \le x$. When $\chi \ne \chi_0$, (13) of §19 implies that, on
the above hypothesis,

$$|\psi(x, \chi)| \ll x^{1/2} + x^{1/2}\sum_{|\gamma| < T}|\rho|^{-1} + xT^{-1}\log^{2} qx + x^{1/4}\log x$$

for $2 \le T \le x$. As proved earlier in this section,

$$\sum_{|\gamma| < T}|\rho|^{-1} \ll \log^{2} qx.$$

Taking $T = x^{1/2}$, we get

$$\psi(x, \chi) \ll x^{1/2}\log^{2} x$$

for $\chi \ne \chi_0$ and $q \le x$. It now follows from (2) that on the generalized
Riemann hypothesis, if $q \le x$,

$$\psi(x; q, a) = \frac{x}{\phi(q)} + O(x^{1/2}\log^{2} x). \tag{14}$$

It will be seen that even this powerful hypothesis gives only a poor
result if $q$ is larger than $x^{1/2}$.

SIEGEL'S THEOREM


Siegel's theorem,¹ in the first of its two forms, states that:

For any $\varepsilon > 0$ there exists a positive number $C_1(\varepsilon)$ such that, if $\chi$
is a real primitive character to the modulus $q$, then

$$L(1, \chi) > C_1(\varepsilon)q^{-\varepsilon}. \tag{1}$$

This implies, by (15) and (16) of §6, that

$$h(d) > C_2(\varepsilon)|d|^{\frac{1}{2}-\varepsilon} \quad \text{for } d < 0 \tag{2}$$

and

$$h(d)\log\eta > C_2(\varepsilon)d^{\frac{1}{2}-\varepsilon} \quad \text{for } d > 0, \tag{3}$$

where $\eta = \tfrac{1}{2}(t_0 + u_0\sqrt{d})$ and $t_0$, $u_0$ have the same meaning as in §6.

In its second form, the theorem states that:

For any $\varepsilon > 0$ there exists a positive number $C_3(\varepsilon)$ such that, if $\chi$ is
any real nonprincipal character, with modulus $q$, then $L(s, \chi) \ne 0$ for

$$s > 1 - C_3(\varepsilon)q^{-\varepsilon}. \tag{4}$$

The second form follows easily from the first, by virtue of the
inequality

$$L'(s, \chi) = O(\log^{2} q)$$

for $1 - 1/\log q \le s \le 1$, which was (11) of §14. If $q$ is large, as we may
suppose, then a zero $\beta$ of $L(s, \chi)$ satisfying (4) will lie in the interval
just specified, and it will follow that

$$L(1, \chi) = L(1, \chi) - L(\beta, \chi) < c_1(\log^{2} q)C_3(\varepsilon)q^{-\varepsilon},$$

which contradicts (1) if we there replace $\varepsilon$ by $\tfrac{1}{2}\varepsilon$. This proves the
second form of the theorem (assuming the first) when $\chi$ is primitive,
and this suffices to prove it when $\chi$ is any real nonprincipal character.

¹ Acta Arithmetica, 1, 83-86 (1935).

It follows that any real zero $\beta$ of $L(s, \chi)$, for real nonprincipal $\chi$,
satisfies

$$\beta \le 1 - C_3(\varepsilon)q^{-\varepsilon}, \tag{5}$$

and this is a much superior estimate, in principle, to any we have had
hitherto. It has, however, the disadvantage of being noneffective,
in the sense that it is not possible, with existing knowledge, to assign a
numerical value to $C_3(\varepsilon)$ for a particular value of $\varepsilon$ (for example, $\tfrac{1}{4}$).

Siegel's theorem was the culmination of a series of discoveries by
several mathematicians. The problem of proving that $h(d) \to \infty$
as $d \to -\infty$, or even of proving that $h(d) \ge 2$ if $-d$ is sufficiently
large, was propounded by Gauss, but no progress toward its solution
was made until modern times. Hecke² proved that if the inequality
$\beta < 1 - c_1/\log q$ holds for the real zeros of $L$ functions formed with
real primitive characters, then $h(d) > c_2|d|^{1/2}/\log|d|$. In particular
this conclusion would follow from the generalized Riemann hypothesis.

In 1933, Deuring³ proved the unexpected result that the falsity of
the classical Riemann hypothesis for $\zeta(s)$ implies that $h(d) \ge 2$ if $-d$
is sufficiently large, and shortly afterward Mordell⁴ proved that this
assumption also implies that $h(d) \to \infty$ as $d \to -\infty$. Their work was
based on a study of the behavior, as $d \to -\infty$, of

$$\sum_{Q}\sum_{x, y} Q(x, y)^{-s},$$

where $Q$ runs through a representative set of forms of discriminant $d$.

In 1934, Heilbronn⁵ took a further important step forward. He
proved that the falsity of the generalized Riemann hypothesis
implies that $h(d) \to \infty$ as $d \to -\infty$. Combined with the result of
Hecke, this gave an unconditional proof that $h(d) \to \infty$, and so
solved Gauss' problem.

Also in 1934, Heilbronn and Linfoot⁶ proved that there are at
most ten negative discriminants $d$ for which $h(d) = 1$. As nine such
$d$ were known,

$$-3,\ -4,\ -7,\ -8,\ -11,\ -19,\ -43,\ -67,\ -163,$$

the question was whether there is a tenth such discriminant. If there
were, then the $L$ function $L(s, \chi_d)$ would have a real zero $\beta$ larger
than $\tfrac{1}{2}$. In 1966, Baker⁷ and Stark⁸ proved independently that there
is no such tenth discriminant. Baker noted that his fundamental
theorem in transcendence theory provides a solution of this class
number problem in view of earlier work of Gelfond and Linnik.
Stark was inspired by a paper of Heegner⁹ in which elliptic modular
functions were used to show that there is no tenth discriminant with
class number 1. It was long thought that Heegner's argument was
incomplete, partly because it seemed to depend on an unproved
conjecture of Weber. However, in retrospect it has now been found
that Heegner's proof is essentially correct; the obscure details have
been clarified by Deuring¹⁰ and Stark¹¹.

² See Landau, Göttinger Nachrichten, 1918, 285-295. The same argument allows
one to deduce the first form of Siegel's theorem from the second.

³ Math. Zeitschrift, 37, 405-415 (1933).

⁴ J. London Math. Soc., 9, 289-298 (1934).

⁵ Quarterly J. of Math., 5, 150-160 (1934).

⁶ Quarterly J. of Math., 5, 293-301 (1934).

Baker and Stark have found¹² all quadratic discriminants $d < 0$
for which $h(d) = 2$, but for $h(d) = 3$ the problem of finding all such $d$
is still open. The fact that it has not been possible to find all such $d$,
or to reduce the problem to one of computation, reflects the fact
that the more powerful arguments that have been developed for this
problem are of an indirect character.
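For any single negative discriminant the class number is a finite computation. The sketch below uses the classical count of reduced primitive positive definite forms $ax^2 + bxy + cy^2$ with $b^2 - 4ac = d$; this is a standard method that is not described in the text, and the function name is an illustrative assumption. It confirms $h(d) = 1$ for the nine discriminants listed above.

```python
from math import gcd

def class_number(d):
    """Count reduced primitive forms (a, b, c) with b^2 - 4ac = d < 0."""
    if d >= 0 or d % 4 not in (0, 1):
        raise ValueError("d must be a negative discriminant")
    count = 0
    a = 1
    while 3 * a * a <= -d:  # a reduced form has |b| <= a <= c, hence 3a^2 <= |d|
        for b in range(-a + 1, a + 1):  # -a < b <= a
            num = b * b - d  # this must equal 4ac
            if num % (4 * a):
                continue
            c = num // (4 * a)
            # Reduced forms need a <= c, with b >= 0 when a = c.
            if c < a or (c == a and b < 0):
                continue
            if gcd(gcd(a, abs(b)), c) == 1:
                count += 1
        a += 1
    return count

NINE = [-3, -4, -7, -8, -11, -19, -43, -67, -163]
class_numbers = {d: class_number(d) for d in NINE}
print(class_numbers)
```

Such a count settles any individual $d$, but it cannot by itself show that the list is complete, which is the point at issue in the text.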

We shall now prove Siegel’s theorem, in the form first stated, using 
the simplified method given later by Estermann.!? The basic idea is 
to combine the L functions of two characters. Let y,, 7, be real 
primitive characters to the distinct moduli q,, q,; as we saw in §14, 
%1X%2 1S a Nonprincipal (though not necessarily primitive) character 
to the modulus q,q). Let 


(6) F(s) = C(s)L(s, ,)L(s, x2) L(s, %1%2)- 


Then F (s) is regular in the whole plane except for a simple pole at 
s = |, and its residue at this pole is 


(7) A= LI, x )LG, 72)L0, 41%). 
An essential part in the proof is played by the inequality 


Cah 


(8)  F(s)>4—- (g@aq5 Ph" ® forg<s<l. 


Ls 
An inequality of the same general character as (8) was used by 
Siegel, and was deduced by him from the work of Hecke on the 


7 Mathematika, 13, 204-216 (1966). See also Chapter 5 of Baker, Transcendental 
Number Theory, Cambridge University Press, 1975. 

8 Michigan Math. J., 14, 1-27 (1967). 

9 Math. Z., 56, 227-252 (1952). 

10 Invent. Math., 5, 169-179 (1968). 

11 J. Number Theory, 1, 16-27 (1969); Modular Functions of One Variable I, 
Springer-Verlag, Berlin, 1973, pp. 153-174. 

12 Ann. Math., 94, 139-152, 153-173 (1971). 

13 J. London Math. Soc., 23, 275-279 (1948). Other simple proofs have been given 
by Chowla, Annals of Math. (2) 51, 120-122 (1950) and by Goldfeld, Proc. Nat. Acad. 
Sci. U.S.A., 71, 1055 (1974). 
functional equation of the Dedekind ζ function of an arbitrary 
algebraic number field. The function F(s) is essentially the Dedekind 
ζ function of a biquadratic field. A simple proof of Siegel’s inequality 
was given by Heilbronn,¹⁴ but even this requires some knowledge of 
algebraic number theory and contains some complications of detail. 

Estermann’s proof of (8) is relatively simple. The multiplication of 
the Euler products gives 

$$F(s) = \sum_{n=1}^{\infty} a_n n^{-s}$$

for $\sigma > 1$, where $a_1 = 1$ and $a_n \ge 0$ for all n. The last fact follows 
from 

$$\log F(s) = \sum_{p}\sum_{m=1}^{\infty} m^{-1}p^{-ms}[1 + \chi_1(p^m)][1 + \chi_2(p^m)],$$

where the coefficients are obviously nonnegative. As in de la Vallée 
Poussin’s argument in §4, we obtain 

$$F(s) = \sum_{m=0}^{\infty} b_m(2 - s)^m$$

for $|s - 2| < 1$, where $b_0 \ge 1$ and $b_m \ge 0$ for all m. 
It follows that 

$$F(s) - \frac{\lambda}{s - 1} = \sum_{m=0}^{\infty}(b_m - \lambda)(2 - s)^m, \tag{9}$$

and this must be valid for $|s - 2| < 2$, since the left side is regular 
there. 

On the circumference of the circle $|s - 2| = \tfrac32$, the function $\zeta(s)$ 
is bounded, and the L functions satisfy 

$$|L(s, \chi_1)| < c_5q_1, \quad |L(s, \chi_2)| < c_5q_2, \quad |L(s, \chi_1\chi_2)| < c_5q_1q_2,$$

by (14) of §12. (This inequality was proved for any nonprincipal 
character, whether primitive or not.) Hence 

$$|F(s)| < c_6q_1^2q_2^2$$

on the circumference, and the same applies to $\lambda/(s - 1)$, since $\lambda$ is the 
product of three L functions which satisfy the above inequalities. It 
follows from Cauchy’s inequalities for the coefficients of a power 
series, applied to the function (9), that 

$$|b_m - \lambda| < 2c_6q_1^2q_2^2(\tfrac23)^m. \tag{10}$$



14 Quarterly J. of Math., 9, 194-195 (1938). 
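For $\tfrac78 < s < 1$, we have 

$$\begin{aligned}
\sum_{m=M}^{\infty}|b_m - \lambda|(2 - s)^m &\le \sum_{m=M}^{\infty} 2c_6q_1^2q_2^2(\tfrac23)^m(2 - s)^m \\
&< 2c_6q_1^2q_2^2\sum_{m=M}^{\infty}(\tfrac34)^m \\
&< c_7q_1^2q_2^2(\tfrac34)^M < c_7q_1^2q_2^2e^{-M/4}.
\end{aligned}$$

Hence, in this interval, 

$$\begin{aligned}
F(s) - \frac{\lambda}{s - 1} &\ge 1 - \lambda\sum_{m=0}^{M-1}(2 - s)^m - c_7q_1^2q_2^2e^{-M/4} \\
&= 1 - \lambda\frac{(2 - s)^M - 1}{1 - s} - c_7q_1^2q_2^2e^{-M/4}.
\end{aligned}$$

We choose M to satisfy 

$$\tfrac12e^{-1/4} \le c_7q_1^2q_2^2e^{-M/4} < \tfrac12$$

and obtain 

$$F(s) > \frac12 - \frac{\lambda}{1 - s}(2 - s)^M.$$

Since 

$$\tfrac14M \le 2\log q_1q_2 + c_8,$$

so that 

$$M \le 8\log q_1q_2 + c_9,$$

we have 

$$(2 - s)^M = \exp[M\log(1 + 1 - s)] < \exp[M(1 - s)] < c_{10}(q_1q_2)^{8(1-s)}.$$

This proves (8). 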

To deduce Siegel’s theorem, we distinguish (following Estermann) 
two cases, the distinction depending on the given positive number ε. 
If there is a real primitive character for which $L(s, \chi)$ has a real zero 
between $1 - \tfrac1{16}\varepsilon$ and 1, we choose $\chi_1$ to be such a character and $\beta_1$ to 
be the zero in question. Then $F(\beta_1) = 0$, independently of what 
$\chi_2$ may be. If there is no such character, we choose $\chi_1$ to be any real 
primitive character and $\beta_1$ to be any number satisfying 
$1 - \tfrac1{16}\varepsilon < \beta_1 < 1$. Then $F(\beta_1) < 0$, independently of what $\chi_2$ may be, for 
$\zeta(s) < 0$ when 0 < s < 1 by (7) of §4, and the three L functions in (6) 
are positive when s = 1 and do not vanish for $\beta_1 \le s \le 1$. Thus in 
either case $F(\beta_1) \le 0$, and the inequality (8) gives 

$$c_4\lambda > \tfrac12(1 - \beta_1)(q_1q_2)^{-8(1-\beta_1)}.$$
From now on we keep $\chi_1$ and $\beta_1$ fixed. Let $\chi_2$ be any real primitive 
character to a modulus $q_2 > q_1$. It follows from (13) of §14 that 

$$\lambda < (c_{11}\log q_1)L(1, \chi_2)(c_{11}\log q_1q_2).$$

Hence 

$$L(1, \chi_2) > Cq_2^{-8(1-\beta_1)}(\log q_2)^{-1},$$

where C depends only on $\chi_1$, and therefore only on ε. Since 
$8(1 - \beta_1) < \tfrac12\varepsilon$, the last inequality implies (1) if $q_2$ is sufficiently 
large (as we may suppose). This establishes Siegel’s theorem. 
THE PRIME NUMBER THEOREM 
FOR ARITHMETIC 
PROGRESSIONS (II) 


By appealing to Siegel’s theorem we can obtain a better approximation 
to $\psi(x; q, a)$ than was possible in §20. 
If we suppose that 

$$q \le \exp[C(\log x)^{1/2}] \tag{1}$$

for some positive constant C, then (8) of §20 tells us that (for $\chi \ne \chi_0$) 

$$\psi(x, \chi) = -\frac{x^{\beta_1}}{\beta_1} + O\{x\exp[-C'(\log x)^{1/2}]\},$$

where C′ is a positive constant depending only on C. Here the term 
in $\beta_1$ occurs for at most one real character (mod q). Siegel’s theorem 
states that for any ε > 0 there exists $C_1(\varepsilon)$ such that 

$$\beta_1 < 1 - C_1(\varepsilon)q^{-\varepsilon}.$$

Hence 

$$x^{\beta_1} < x\exp[-C_1(\varepsilon)(\log x)q^{-\varepsilon}].$$

In order that this expression may be small compared with x, we 
must impose a more severe restriction on q than that expressed by 
(1). Suppose that 

$$q \le (\log x)^N \tag{2}$$

for some positive constant N. Then, on taking $\varepsilon = (2N)^{-1}$, we get 
$q^{\varepsilon} \le (\log x)^{1/2}$, and 

$$x^{\beta_1} < x\exp[-C_2(N)(\log x)^{1/2}].$$

Thus, subject to (2), we have 

$$\psi(x, \chi) \ll x\exp[-C_3(N)(\log x)^{1/2}] \tag{3}$$

for any nonprincipal χ (mod q). 
Substituting in (3) of §20, we obtain the following result for 
$\psi(x; q, a)$, which represents the best form so far known of the prime 
number theorem for arithmetic progressions.¹ 

Let N be any positive constant. Then there exists a positive number 
$C_3(N)$, depending only on N, such that if q satisfies (2) then 

$$\psi(x; q, a) = \frac{x}{\phi(q)} + O\{x\exp[-C_3(N)(\log x)^{1/2}]\} \tag{4}$$

uniformly in q. 

The various results for $\psi(x; q, a)$, which have been found in §20 
and here, have analogs for $\pi(x; q, a)$, the number of primes up to x 
that are congruent to a (mod q). These are derived by partial summation, 
as in §18. The main term is now $(\operatorname{li} x)/\phi(q)$ in place of $x/\phi(q)$, 
and the error terms are all reduced by a factor log x. But the latter 
change is of no significance except for the analog of (14) of §20, which 
was based on the assumption of the generalized Riemann hypothesis. 

As we have seen, the main difficulty in approximating to $\psi(x; q, a)$ 
arises from the term containing $\beta_1$. But if this term is retained, so 
that one is prepared to accept a result of the form 

$$\psi(x; q, a) = \frac{x}{\phi(q)} - \frac{\chi_1(a)x^{\beta_1}}{\phi(q)\beta_1} + O(\ldots), \tag{5}$$

where $\chi_1$ is the possible real character with the exceptional zero $\beta_1$, 
further progress is possible. The error term then comes essentially 
from 

$$\frac{1}{\phi(q)}\sum_{\chi}\bar\chi(a)\sum_{\rho}\frac{x^{\rho}}{\rho},$$

and here it is not essential to have a good estimate for the real part 
of each ρ, provided one can handle the above average over the $\phi(q)$ 
characters. Results in this direction can be based on estimates for 

$$\sum_{\chi} N(\alpha, T, \chi), \tag{6}$$

where $N(\alpha, T, \chi)$ denotes the number of zeros of $L(s, \chi)$ in the rectangle 

$$\alpha \le \sigma \le 1, \quad |t| \le T.$$


Such estimates were obtained² by Rodosskii and Tatuzawa, building 
upon work of Linnik, and as a consequence they were able to obtain 


1 This application of Siegel’s theorem to primes in arithmetic progressions was 
made by Walfisz, Math. Zeitschrift, 40, 592-607 (1936). 
2 For an account of their work, see Prachar, Chap. 9. 
an improved error term in (5), or alternatively the same error term 
for a longer range of q. 

The value of a result of the type (5) is mainly in connection with 
the distribution of primes in a relatively short segment of an arith- 
metic progression. When such a formula is applied with two values 
of x that are not far apart, and the results subtracted, the terms 
containing $\beta_1$ largely cancel. 

The methods for estimating the sum (6) are based to a considerable 
extent on earlier work³ by a large number of mathematicians on 
the estimation of $N(\alpha, T)$, the number of zeros of $\zeta(s)$ in the rectangle 

$$\alpha \le \sigma \le 1, \quad 0 < t \le T.$$


From now on, we shall be concerned primarily with the proof of an 
estimate for 


$$\psi(x; q, a) - x/\phi(q),$$


not for an individual value of q but on the average over q up to a 
certain bound. Such results are obtained by the “large sieve” 
method, which we discuss in §27. 

The first result of this general nature was given by Rényi. He 
proved‘ that, for primes g < (7;N)*, the inequality 


5q, a) — (li ae) aay 
|x(x 5 q, a) — (li x)/(q)| « gt log N 
holds, apart from certain possible exceptional pairs q, a; the number 
of exceptional q is < N* log N, and the number of exceptional a 
for each g is < q?. 

In §28 we shall prove the following simple and far-reaching 
result of Bombieri: For any positive constant A, there exists a 
positive constant B such that 

$$\sum_{q \le X}\max_{(a,q)=1}\max_{y \le x}\Bigl|\psi(y; q, a) - \frac{y}{\phi(q)}\Bigr| \ll x(\log x)^{-A}, \qquad X = x^{1/2}(\log x)^{-B}.$$


The form of the inequality, with two maxima, may at first sight seem 
complicated, but it is one that is very convenient for applications. 


3 See Titchmarsh, Chap. 9. 
4 Compositio Mathematica, 8, 68-75 (1950). 
THE PÓLYA-VINOGRADOV 
INEQUALITY 


Suppose that χ is a nonprincipal character (mod q). Since 
$\sum_{n=1}^{q}\chi(n) = 0$, it is clear that $\sum_{n=M+1}^{M+N}\chi(n) \ll q$ for any M and N. 
However, a sharper bound is needed to describe the distribution of 
power residues within the interval $1 \le n \le q$. In 1918 Pólya¹ and 
Vinogradov² proved independently that 

$$\sum_{n=M+1}^{M+N}\chi(n) \ll q^{1/2}\log q \tag{1}$$

for nonprincipal characters χ (mod q). By taking $\chi(n) = (n|p)$, we 
deduce that the interval $M + 1 \le n \le M + N$ contains 
$\tfrac12N + O(p^{1/2}\log p)$ quadratic residues (mod p). The Pólya-Vinogradov 
inequality will be used in our arguments of §28. 

Pólya considered the sum $\sum_{n \le xq}\chi(n)$ as a function with period 1, 
and determined its Fourier expansion. The Fourier expansion is 
not absolutely convergent, and so does not immediately provide a 
proof of (1), but Pólya also derived a truncated expansion which 
suffices. Pólya’s analysis is fundamental to more detailed investigations, 
but for our purposes an elementary argument of Schur³ suffices. 

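We first prove that 

$$\Bigl|\sum_{n=M+1}^{M+N}\chi(n)\Bigr| < q^{1/2}\log q \tag{2}$$

for primitive characters χ (mod q), q > 1. In §9 we saw that for such χ 
and any n, 

$$\chi(n) = \frac{1}{\tau(\bar\chi)}\sum_{a=1}^{q}\bar\chi(a)e\Bigl(\frac{an}{q}\Bigr).$$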


1 Göttinger Nachrichten, 1918, 21-29. 
2 Perm. Univ. Fiz.-mat. ob-vo Zh., 1, 18-28, 94-98 (1918). 
3 Göttinger Nachrichten, 1918, 30-36. 
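Hence the sum in (2) is 

$$\frac{1}{\tau(\bar\chi)}\sum_{a=1}^{q}\bar\chi(a)\sum_{n=M+1}^{M+N}e\Bigl(\frac{an}{q}\Bigr).$$

Here $|\tau(\bar\chi)| = q^{1/2}$, and the inner sum is a geometric series with sum 

$$e\Bigl(\frac{(M + \tfrac12N + \tfrac12)a}{q}\Bigr)\frac{\sin \pi Na/q}{\sin \pi a/q}.$$

Consequently 

$$\Bigl|\sum_{n=M+1}^{M+N}\chi(n)\Bigr| \le q^{-1/2}\sum_{a=1}^{q-1}\frac{1}{|\sin \pi a/q|}.$$

For convex functions $f(\alpha)$, 

$$f(\alpha) \le \frac{1}{\delta}\int_{\alpha-\delta/2}^{\alpha+\delta/2}f(\beta)\,d\beta.$$

Taking $f(\alpha) = (\sin \pi\alpha)^{-1}$, $\delta = 1/q$, we see that the sum above is 

$$\le q^{1/2}\int_{1/2q}^{1-1/2q}(\sin \pi\beta)^{-1}\,d\beta = 2q^{1/2}\int_{1/2q}^{1/2}(\sin \pi\beta)^{-1}\,d\beta.$$

Now $\sin \pi\beta > 2\beta$ for $0 < \beta < \tfrac12$, so that the above is 

$$< 2q^{1/2}\int_{1/2q}^{1/2}\frac{d\beta}{2\beta} = q^{1/2}\log q.$$

Hence we have (2) for primitive χ. 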
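The bound (2) can be tested numerically for small prime moduli. The following Python sketch is only an illustration: it takes χ to be the Legendre symbol modulo an odd prime p (a primitive character), and the function names `legendre` and `max_incomplete_sum` are hypothetical helpers, not notation from the text.

```python
import math

def legendre(n, p):
    """Legendre symbol (n|p) for an odd prime p, via Euler's criterion."""
    r = pow(n % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r  # r is 0, 1 or p - 1

def max_incomplete_sum(p):
    """Largest |sum_{n=M+1}^{M+N} (n|p)| over all M and N.

    By periodicity it is enough to take 0 <= M < p and 1 <= N <= p,
    so we compare partial sums taken over two full periods.
    """
    prefix = [0]
    for n in range(1, 2 * p + 1):
        prefix.append(prefix[-1] + legendre(n, p))
    best = 0
    for M in range(p):
        for N in range(1, p + 1):
            best = max(best, abs(prefix[M + N] - prefix[M]))
    return best

for p in (11, 101, 307):
    bound = math.sqrt(p) * math.log(p)
    print(p, max_incomplete_sum(p), round(bound, 2))
```

In each printed row the middle entry (the largest incomplete sum) should lie below the last entry, in agreement with (2).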
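Suppose now that χ is a nonprincipal character (mod q), induced 
by the primitive character $\chi_1$ (mod $q_1$). Then $q_1 | q$, and we write 
$q = q_1r$. Hence 

$$\sum_{n=M+1}^{M+N}\chi(n) = \sum_{\substack{n=M+1\\ (n,r)=1}}^{M+N}\chi_1(n).$$

Now $\sum_{d|n}\mu(d) = 1$ or 0 according as n = 1 or n > 1, so that the 
above is 

$$\sum_{n=M+1}^{M+N}\chi_1(n)\sum_{d|(n,r)}\mu(d) = \sum_{d|r}\mu(d)\sum_{\substack{n=M+1\\ d|n}}^{M+N}\chi_1(n) = \sum_{d|r}\mu(d)\chi_1(d)\sum_{(M+1)/d \le m \le (M+N)/d}\chi_1(m).$$

In view of (2), the inner sum has modulus $< q_1^{1/2}\log q_1$, so that 

$$\Bigl|\sum_{n=M+1}^{M+N}\chi(n)\Bigr| < q_1^{1/2}(\log q_1)\sum_{d|r}|\mu(d)| = 2^{\omega(r)}q_1^{1/2}\log q_1.$$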
But $2^{\omega(r)} \le d(r) \ll r^{\varepsilon}$ for any ε > 0, and in particular for $\varepsilon = \tfrac12$, 
which gives (1). In fact we can obtain a good numerical constant 
by noting that 

$$d(r) = \sum_{d|r}1 \le 2\sum_{\substack{d|r\\ d \le \sqrt r}}1 \le 2\sqrt r.$$
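The two elementary inequalities used here, $2^{\omega(r)} \le d(r)$ and $d(r) \le 2\sqrt r$, are easy to confirm by machine for small r. The sketch below is a minimal check; the helper name `omega_and_divisors` is an assumption made for this illustration.

```python
import math

def omega_and_divisors(r):
    """Return (omega(r), d(r)): number of distinct primes and of divisors."""
    omega, d, p = 0, 1, 2
    while p * p <= r:
        if r % p == 0:
            e = 0
            while r % p == 0:
                r //= p
                e += 1
            omega += 1
            d *= e + 1
        p += 1
    if r > 1:  # one prime factor larger than the square root is left over
        omega += 1
        d *= 2
    return omega, d

for r in range(1, 10001):
    w, d = omega_and_divisors(r)
    assert 2 ** w <= d <= 2 * math.sqrt(r)
```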
The inequality (1) is close to best possible, for Schur also proved 
that 

$$\max_N\Bigl|\sum_{n \le N}\chi(n)\Bigr| > \frac{1}{2\pi}\sqrt q$$

for all primitive χ (mod q). In 1932 Paley⁴ showed that 

$$\max_N\Bigl|\sum_{n \le N}\Bigl(\frac{d}{n}\Bigr)\Bigr| > \frac17\sqrt d\,\log\log d$$

for infinitely many quadratic discriminants d > 0. In the opposite 
direction Montgomery and Vaughan⁵ have shown recently that, 
assuming the generalized Riemann hypothesis, 

$$\sum_{n=M+1}^{M+N}\chi(n) \ll \sqrt q\,\log\log q$$

for all $\chi \ne \chi_0$ (mod q). Although (1) is close to being best possible, 
for many purposes it is useful to have an estimate which is sharper 
when N is small compared with q; Burgess⁶ has made some progress 
in this direction. 


4 J. London Math. Soc., 7, 28-32 (1932). 
5 Invent. Math., 43, 69-82 (1977). 
6 Proc. London Math. Soc., (3) 13, 524-536 (1963). 
When f is monotonic we can use the prime number theorem and 
partial summation to estimate $\sum_{p \le N}f(p)$. For certain multiplicative 
functions, namely those of the form $f(n) = \chi(n)n^{it}$, we can estimate 
$\sum_{p \le N}f(p)$ by using the zero-free region of $L(s, \chi)$. In 1937 Vinogradov 
introduced a method for estimating sums $\sum_{p \le N}f(p)$ in which f is 
oscillatory but not multiplicative.¹ His starting point was a simple 
sieve idea. Let $P = \prod_{p \le N^{1/2}}p$. For n in the range $1 \le n \le N$ the 
sieve of Eratosthenes asserts that (n, P) = 1 if and only if n = 1 or n 
is a prime number in the interval $N^{1/2} < n \le N$. Hence 

$$f(1) + \sum_{N^{1/2} < p \le N}f(p) = \sum_{\substack{n \le N\\ (n,P)=1}}f(n) = \sum_{t|P}\mu(t)\sum_{r \le N/t}f(rt).$$

Thus we are led to bound sums of the kind $\sum_{r \le N/t}f(rt)$. We need to 
show that these sums are small. However, we cannot hope to get 
much cancellation when t is nearly as large as N, for then the sum 
contains few terms. Therefore Vinogradov rearranged the terms 
arising from t|P,6N < t < N, but this entailed great complications. 
Recently Vaughan² found a new version of Vinogradov’s method in 
which the details are much simpler. 
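Following Vaughan, we let 

$$F(s) = \sum_{m \le U}\Lambda(m)m^{-s}, \qquad G(s) = \sum_{d \le V}\mu(d)d^{-s},$$

and we note the identity 

$$-\frac{\zeta'}{\zeta}(s) = F(s) - \zeta(s)F(s)G(s) - \zeta'(s)G(s) + \Bigl(-\frac{\zeta'}{\zeta}(s) - F(s)\Bigr)(1 - \zeta(s)G(s)), \tag{1}$$
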


1 See Chapter IX of Vinogradov, The Method of Trigonometrical Sums in the 
Theory of Numbers, Interscience, London, 1954. 
2 C.R. Acad. Sci. Paris, Sér A, 285, 981-983 (1977). 
valid for $\sigma > 1$. Calculating the Dirichlet series coefficients of the 
four functions on the right-hand side, we see that 

$$\Lambda(n) = a_1(n) + a_2(n) + a_3(n) + a_4(n),$$

where 

$$a_1(n) = \begin{cases}\Lambda(n) & \text{if } n \le U,\\ 0 & \text{if } n > U;\end{cases}$$

$$a_2(n) = -\sum_{\substack{mdr=n\\ m \le U\\ d \le V}}\Lambda(m)\mu(d);$$

$$a_3(n) = \sum_{\substack{hd=n\\ d \le V}}\mu(d)\log h;$$

and 

$$a_4(n) = -\sum_{\substack{mk=n\\ m > U\\ k > 1}}\Lambda(m)\Bigl(\sum_{\substack{d|k\\ d \le V}}\mu(d)\Bigr).$$

We multiply throughout by f(n) and sum; then 

$$\sum_{n \le N}f(n)\Lambda(n) = S_1 + S_2 + S_3 + S_4,$$

where 

$$S_i = \sum_{n \le N}f(n)a_i(n).$$

In applications we shall bound $S_1$ trivially; the remaining $S_i$ are 
treated individually. 
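The decomposition $\Lambda(n) = a_1(n) + a_2(n) + a_3(n) + a_4(n)$ is a formal identity, so it can be checked term by term for small n. The following Python sketch does this by brute force over divisors; the helper names (`factorize`, `mangoldt`, `mobius`, `divisors`, `vaughan_parts`) and the particular cut-offs U = 5, V = 7 are choices made for this illustration only.

```python
import math

def factorize(n):
    """Prime factorization of n >= 1 as a dict {prime: exponent}."""
    factors, p = {}, 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors

def mangoldt(n):
    """Lambda(n): log p if n is a power of the prime p, and 0 otherwise."""
    f = factorize(n)
    return math.log(next(iter(f))) if len(f) == 1 else 0.0

def mobius(n):
    f = factorize(n)
    return 0 if any(e > 1 for e in f.values()) else (-1) ** len(f)

def divisors(n):
    return [d for d in range(1, n + 1) if n % d == 0]

def vaughan_parts(n, U, V):
    """Return (a1(n), a2(n), a3(n), a4(n)) for the cut-offs U and V."""
    a1 = mangoldt(n) if n <= U else 0.0
    # a2: n = m*d*r with m <= U, d <= V; r is determined by m and d.
    a2 = -sum(mangoldt(m) * mobius(d)
              for m in divisors(n) if m <= U
              for d in divisors(n // m) if d <= V)
    # a3: n = h*d with d <= V.
    a3 = sum(mobius(d) * math.log(n // d) for d in divisors(n) if d <= V)
    # a4: n = m*k with m > U, k > 1, weighted by the truncated Moebius sum.
    a4 = -sum(mangoldt(m) * sum(mobius(d) for d in divisors(n // m) if d <= V)
              for m in divisors(n) if m > U and n // m > 1)
    return a1, a2, a3, a4

U, V = 5, 7
for n in range(1, 301):
    assert abs(sum(vaughan_parts(n, U, V)) - mangoldt(n)) < 1e-9
```

Changing U and V alters the individual parts but not their sum, which is the point of the identity.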
We write $S_2$ in the form 

$$S_2 = -\sum_{t \le UV}\Bigl(\sum_{\substack{md=t\\ m \le U\\ d \le V}}\mu(d)\Lambda(m)\Bigr)\sum_{r \le N/t}f(rt).$$

Again we have a linear combination of the sums $\sum_{r \le N/t}f(rt)$, but 
now we can control the range of t by ensuring that UV is substantially 
smaller than N. As $\sum_{m|t}\Lambda(m) = \log t \le \log UV$, we see 
that 

$$S_2 \ll (\log UV)\sum_{t \le UV}\Bigl|\sum_{r \le N/t}f(rt)\Bigr|. \tag{2}$$
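The sum $S_3$ is of the same form, since 

$$\begin{aligned}
S_3 = \sum_{d \le V}\mu(d)\sum_{h \le N/d}f(dh)\log h &= \sum_{d \le V}\mu(d)\sum_{h \le N/d}f(dh)\int_1^h\frac{dw}{w} \\
&= \int_1^N\sum_{d \le V}\mu(d)\sum_{w \le h \le N/d}f(dh)\,\frac{dw}{w} \\
&\ll (\log N)\sum_{d \le V}\max_w\Bigl|\sum_{w \le h \le N/d}f(dh)\Bigr|.
\end{aligned} \tag{3}$$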

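The sum $S_4$ has a more complicated shape. We note that 

$$\sum_{\substack{d|k\\ d \le V}}\mu(d) = 0$$

for $1 < k \le V$, so that 

$$S_4 = -\sum_{U < m \le N/V}\Lambda(m)\sum_{V < k \le N/m}\Bigl(\sum_{\substack{d|k\\ d \le V}}\mu(d)\Bigr)f(mk).$$

Suppose that $\Delta = \Delta(f, M, N, V)$ is such that 

$$\Bigl|\sum_{M < m \le 2M}b_m\sum_{V < k \le N/m}c_kf(mk)\Bigr| \le \Delta\Bigl(\sum_{M < m \le 2M}|b_m|^2\Bigr)^{1/2}\Bigl(\sum_{k \le N/M}|c_k|^2\Bigr)^{1/2} \tag{4}$$

for any complex numbers $b_m$, $c_k$; such bilinear form inequalities are 
familiar, and we have means of estimating Δ. Thus 

$$S_4 \ll (\log N)\max_{U \le M \le N/V}\Delta\Bigl(\sum_{M < m \le 2M}\Lambda(m)^2\Bigr)^{1/2}\Bigl(\sum_{k \le N/M}d(k)^2\Bigr)^{1/2}.$$

Here the sum over m is estimated by noting that 

$$\sum_{m \le z}\Lambda(m)^2 \le (\log z)\sum_{m \le z}\Lambda(m) \ll z\log z.$$

As for the sum over k, we observe that $d(k)^2 = \sum_{d|k}g(d)$, where 
g(d) is the multiplicative function for which $g(p^{\alpha}) = 2\alpha + 1$. Thus 

$$\begin{aligned}
\sum_{k \le z}d(k)^2 = \sum_{k \le z}\sum_{d|k}g(d) &= \sum_{d \le z}g(d)[z/d] \\
&\le z\sum_{d \le z}g(d)/d \le z\prod_{p \le z}(1 + g(p)/p + g(p^2)/p^2 + \cdots) \\
&\le z\prod_{p \le z}\Bigl(1 - \frac1p\Bigr)^{-3} \ll z(\log 2z)^3.
\end{aligned}$$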
Combining these estimates, we see that 

$$S_4 \ll N^{1/2}(\log N)^3\max_{U \le M \le N/V}\Delta. \tag{5}$$

To be more specific we now suppose that $|f(n)| \le 1$ for all n. Then 
the estimate $\sum_{n \le N}f(n)\Lambda(n) \ll N$ is trivial, and we seek a sharper 
estimate. Clearly $S_1 \ll U$. From (2) we obtain the trivial estimate 
$S_2 \ll N(\log UV)^2$; hence we do not require much cancellation in the 
sums $\sum_{r \le N/t}f(rt)$ to show that $S_2 = o(N)$. Similar remarks apply 
to $S_3$. For $S_4$ we obtain a trivial bound for Δ by applying Cauchy’s 
inequality: 

$$\sum_{M < m \le 2M}b_m\sum_{V < k \le N/m}c_kf(mk) \ll \Bigl(M\cdot\frac{N}{M}\Bigr)^{1/2}\Bigl(\sum_{M < m \le 2M}|b_m|^2\Bigr)^{1/2}\Bigl(\sum_{k \le N/M}|c_k|^2\Bigr)^{1/2}.$$

Hence the estimate $\Delta \ll N^{1/2}$ is trivial, which in (5) gives 
$S_4 \ll N(\log N)^3$; thus we need only a slightly sharper bound for Δ. 
Note however that no such improvement is possible if f is totally 
multiplicative and unimodular, since we may take $b_m = \overline{f(m)}$, 
$c_k = \overline{f(k)}$. For this reason the principal applications of the method 
involve functions f which are not multiplicative. 

For most f we are not able to determine the least Δ for which (4) 
holds. However, the following approach is very useful: By Cauchy’s 
inequality, the left-hand side of (4) is 

$$\le \Bigl(\sum_{M < m \le 2M}|b_m|^2\Bigr)^{1/2}\Bigl(\sum_{M < m \le 2M}\Bigl|\sum_{V < k \le N/m}c_kf(mk)\Bigr|^2\Bigr)^{1/2}.$$

Here the second sum over m is 

$$\sum_{V < j \le N/M}\sum_{V < k \le N/M}c_j\bar c_k\sum_{\substack{M < m \le 2M\\ m \le N/j\\ m \le N/k}}f(mj)\overline{f(mk)}.$$

We note that $|c_j\bar c_k| \le \tfrac12|c_j|^2 + \tfrac12|c_k|^2$; hence the above is 

$$\le \sum_{V < j \le N/M}|c_j|^2\sum_{V < k \le N/M}\Bigl|\sum_{\substack{M < m \le 2M\\ m \le N/j\\ m \le N/k}}f(mj)\overline{f(mk)}\Bigr|.$$

Thus 

$$\Delta \ll \Bigl(\max_{V < j \le N/M}\sum_{V < k \le N/M}\Bigl|\sum_{\substack{M < m \le 2M\\ m \le N/j\\ m \le N/k}}f(mj)\overline{f(mk)}\Bigr|\Bigr)^{1/2}.$$

This bound is largest when f = 1, and then we obtain again the 
trivial bound $\Delta \ll N^{1/2}$. If f is totally multiplicative and unimodular 
the bound is unchanged, but otherwise we may expect some cancellation 
in the inner sum, and hence a nontrivial bound for Δ. 

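Combining our estimates, we see that if $|f(n)| \le 1$ for all n, 
$U \ge 2$, $V \ge 2$, $UV \le N$, then 

$$\begin{aligned}
\sum_{n \le N}f(n)\Lambda(n) \ll{}& U + (\log N)\sum_{t \le UV}\max_w\Bigl|\sum_{w \le r \le N/t}f(rt)\Bigr| \\
&+ N^{1/2}(\log N)^3\max_{U \le M \le N/V}\max_{V \le j \le N/M}\Bigl(\sum_{V < k \le N/M}\Bigl|\sum_{\substack{M < m \le 2M\\ m \le N/j\\ m \le N/k}}f(mj)\overline{f(mk)}\Bigr|\Bigr)^{1/2}.
\end{aligned} \tag{6}$$

In conclusion we remark that in some situations sharper estimates 
can be obtained by treating $S_2$ more carefully. Write 

$$S_2 = \sum_{t \le UV} = \sum_{t \le U} + \sum_{U < t \le UV} = S_2' + S_2''.$$

Then treat $S_2'$ as we did $S_3$, and $S_2''$ as we did $S_4$. 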
AN EXPONENTIAL SUM FORMED 
WITH PRIMES 


Vinogradov first used his method to estimate the important sum 

$$S(\alpha) = \sum_{n \le N}\Lambda(n)e(n\alpha).$$

We now use our general estimates of the previous section to bound 
$S(\alpha)$. We find that our results depend on rational approximations 
to α: If 

$$\Bigl|\alpha - \frac{a}{q}\Bigr| \le \frac{1}{q^2}, \qquad (a, q) = 1, \tag{1}$$

then 

$$S(\alpha) \ll (Nq^{-1/2} + N^{4/5} + N^{1/2}q^{1/2})(\log N)^4. \tag{2}$$


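To prove this we first note that 

$$\sum_{n=N_1+1}^{N_2}e(n\beta) = \frac{e((N_2 + 1)\beta) - e((N_1 + 1)\beta)}{e(\beta) - 1} \ll \min\Bigl(N_2 - N_1, \frac{1}{\|\beta\|}\Bigr).$$

Here $\|\beta\|$ denotes the distance from β to the nearest integer. Hence 

$$\sum_{t \le T}\max_w\Bigl|\sum_{w \le r \le N/t}e(rt\alpha)\Bigr| \ll \sum_{t \le T}\min\Bigl(\frac{N}{t}, \frac{1}{\|t\alpha\|}\Bigr).$$

We assume for the moment that this latter expression is 

$$\ll \Bigl(\frac{N}{q} + T + q\Bigr)\log 2qT \tag{3}$$
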
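for α satisfying (1). Then the upper bound (6) of §24 gives 

$$\begin{aligned}
S(\alpha) \ll{}& U + \Bigl(\frac{N}{q} + UV + q\Bigr)(\log 2qN)^2 \\
&+ N^{1/2}(\log N)^3\max_{U \le M \le N/V}\max_{V \le j \le N/M}\Bigl(\sum_{V < k \le N/M}\min\Bigl(M, \frac{1}{\|(k - j)\alpha\|}\Bigr)\Bigr)^{1/2}.
\end{aligned}$$

This last term is 

$$\ll N^{1/2}(\log N)^3\max_{U \le M \le N/V}\Bigl(M + \sum_{1 \le m \le N/M}\min\Bigl(\frac{N}{m}, \frac{1}{\|m\alpha\|}\Bigr)\Bigr)^{1/2},$$

and by (3) again this is 

$$\begin{aligned}
&\ll N^{1/2}(\log N)^3\max_{U \le M \le N/V}\Bigl(M + \frac{N}{M} + \frac{N}{q} + q\Bigr)^{1/2}(\log qN)^{1/2} \\
&\ll (NV^{-1/2} + NU^{-1/2} + Nq^{-1/2} + N^{1/2}q^{1/2})(\log qN)^4.
\end{aligned}$$

Hence altogether we have 

$$S(\alpha) \ll (UV + q + NU^{-1/2} + NV^{-1/2} + Nq^{-1/2} + N^{1/2}q^{1/2})(\log qN)^4.$$

The estimate (2) is trivial if q > N, so we may assume that $q \le N$. 
Then we obtain (2) by taking $U = V = N^{2/5}$. 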

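It remains to establish the estimate (3). Write t = hq + r with 
$1 \le r \le q$, and put $\beta = \alpha - a/q$. Then 

$$\sum_{t \le T}\min\Bigl(\frac{N}{t}, \frac{1}{\|t\alpha\|}\Bigr) \ll \sum_{0 \le h \le T/q}\sum_{r=1}^{q}\min\Bigl(\frac{N}{hq + r}, \frac{1}{\|ra/q + hq\beta + r\beta\|}\Bigr).$$

We consider first those terms for which h = 0, $1 \le r \le \tfrac12q$. For such 
r we have $|r\beta| \le 1/2q$, so that the contribution of these terms is 

$$\ll \sum_{1 \le r \le q/2}\frac{1}{\|ra/q\| - 1/2q} \ll q\log q.$$

For all remaining terms we have $hq + r \gg (h + 1)q$. Let h be given, 
and let I be an interval in [0, 1] of length 1/q. There are at most 4 
values of r, $1 \le r \le q$, for which 

$$\frac{ra}{q} + hq\beta + r\beta \in I \pmod 1.$$

Hence the remaining terms contribute 

$$\begin{aligned}
\sum_{0 \le h \le T/q}\sum_{r}\min\Bigl(\frac{N}{(h + 1)q}, \frac{1}{\|ra/q + hq\beta + r\beta\|}\Bigr) &\ll \sum_{0 \le h \le T/q}\Bigl(\frac{N}{(h + 1)q} + q\log q\Bigr) \\
&\ll \Bigl(\frac{N}{q} + T + q\Bigr)\log 2qT,
\end{aligned}$$

and the proof is complete. 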

One may note that our estimate (3) is sharp, even in the special 
case $\alpha = a/q$, but that if the hypothesis (1) is weakened then the 
bound (3) must be correspondingly weakened. 
SUMS OF THREE PRIMES 


Hardy and Littlewood¹ showed, assuming the generalized 
Riemann hypothesis, that every sufficiently large odd number is a 
sum of three primes. In their argument, the hypothesis was required 
to provide estimates corresponding to our estimates of $S(\alpha)$ in §25. 
In 1937 Vinogradov² used his new estimates to treat sums of three 
primes unconditionally. Instead of considering the number of 
representations of n as a sum of three primes, we deal with the related 
quantity 

$$r(n) = \sum\Lambda(k_1)\Lambda(k_2)\Lambda(k_3),$$

where the sum is extended over all triples $k_1, k_2, k_3$ of numbers for 
which $k_1 + k_2 + k_3 = n$. Thus r(n) is a weighted counting of the 
number of representations of n as a sum of three prime powers. 
In additive questions it is appropriate to use a power-series generating 
function or exponential sum. Taking 

$$S(\alpha) = \sum_{k \le N}\Lambda(k)e(k\alpha),$$

we see that 

$$S(\alpha)^3 = \sum_{n}r'(n)e(n\alpha),$$

where r′(n) is defined in the same way as r(n) but with the further 
restriction that the $k_i$ do not exceed N. Hence r′(n) = r(n) for 
$n \le N$. As $S(\alpha)^3$ is a trigonometric polynomial, we can calculate 
r(N) by the Fourier coefficient formula 

$$r(N) = \int_0^1S(\alpha)^3e(-N\alpha)\,d\alpha. \tag{1}$$


1 Acta Math., 44, 1-70 (1922). 
2 Mat. Sb., N.S. 2 (O.S. 44), 179-195 (1937). 
We shall find that the integrand is large when α is near a rational 
number with small denominator; by estimating the contributions 
made by these peaks, we prove the following: 

THEOREM (Vinogradov). For any fixed A > 0, 

$$r(N) = \tfrac12\mathfrak S(N)N^2 + O(N^2(\log N)^{-A}),$$

where 

$$\mathfrak S(N) = \prod_{p|N}\Bigl(1 - \frac{1}{(p - 1)^2}\Bigr)\prod_{p \nmid N}\Bigl(1 + \frac{1}{(p - 1)^3}\Bigr).$$

The above is of little use when N is even, for then $\mathfrak S(N) = 0$, at 
least one of the $k_i$ is a power of two, and hence $r(N) \ll N(\log N)^4$. 
However, if N is odd, then $\mathfrak S(N) \gg 1$, and hence $r(N) \gg N^2$ for all 
large odd N. The contribution made to r(N) by proper prime powers 
is easily seen to be $\ll N^{3/2}(\log N)^2$; hence all large odd N can be written 
as a sum of three primes in $\gg N^2(\log N)^{-3}$ ways. 
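The behaviour of the singular series on even and odd N is easy to see numerically. The Python sketch below evaluates the Euler product over primes up to a cut-off; the names `primes_up_to`, `singular_series` and the parameter `prime_bound` are assumptions made for this illustration, and primes beyond the cut-off are simply omitted, which changes the value only by a factor very close to 1.

```python
def primes_up_to(n):
    """Sieve of Eratosthenes returning all primes <= n."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, int(n ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, n + 1, p)))
    return [p for p in range(2, n + 1) if sieve[p]]

def singular_series(N, prime_bound=10000):
    """Euler product for the singular series, truncated at prime_bound."""
    value = 1.0
    for p in primes_up_to(prime_bound):
        if N % p == 0:
            value *= 1 - 1 / (p - 1) ** 2   # factor for primes dividing N
        else:
            value *= 1 + 1 / (p - 1) ** 3   # factor for primes not dividing N
    return value

for N in (100, 101, 105):
    print(N, singular_series(N))
```

For even N the factor at p = 2 is exactly zero, while for odd N that factor equals 2 and the remaining factors stay bounded away from zero.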

We now divide the range of integration in (1) into subintervals 
for detailed treatment. Let $P = (\log N)^B$, $Q = N(\log N)^{-B}$, where B 
will be chosen later in terms of A. For $q \le P$, $1 \le a \le q$, (a, q) = 1, 
let $\mathfrak M(q, a)$ denote the interval $|\alpha - a/q| \le 1/Q$. Here we are 
considering the real numbers modulo 1, so that $\mathfrak M(1, 1)$ can be 
thought of as the interval $|\alpha| \le 1/Q$. Let $\mathfrak M$ denote the union of 
these “major arcs.” We note that two major arcs $\mathfrak M(q, a)$ and 
$\mathfrak M(q', a')$ are disjoint if $a/q \ne a'/q'$, since 

$$\Bigl|\frac{a}{q} - \frac{a'}{q'}\Bigr| \ge \frac{1}{qq'} \ge \frac{1}{P^2} > \frac{2}{Q}.$$

We let $\mathfrak m$ (standing for “minor arcs”) denote the complement in 
[0, 1] of $\mathfrak M$. 

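We now estimate the contribution of the major arcs to the 
integral (1). To this end we first determine the size of $S(\alpha)$ for 
$\alpha \in \mathfrak M(q, a)$. We easily see that 

$$\frac{1}{\phi(q)}\sum_{\chi}\tau(\bar\chi)\chi(h) = \begin{cases}e(h/q) & \text{if } (h, q) = 1,\\ 0 & \text{if } (h, q) > 1.\end{cases}$$

Hence if (a, q) = 1, $\alpha = a/q + \beta$, then 

$$\sum_{\substack{k \le N\\ (k,q)=1}}\Lambda(k)e(k\alpha) = \frac{1}{\phi(q)}\sum_{\chi}\tau(\bar\chi)\chi(a)\sum_{k \le N}\chi(k)\Lambda(k)e(k\beta),$$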
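so that 

$$S(\alpha) = \frac{1}{\phi(q)}\sum_{\chi}\tau(\bar\chi)\chi(a)\sum_{k \le N}\chi(k)\Lambda(k)e(k\beta) + O((\log N)^2). \tag{2}$$

It is easy to verify that the inner sum here is 

$$= e(N\beta)\psi(N, \chi) - 2\pi i\beta\int_1^Ne(u\beta)\psi(u, \chi)\,du. \tag{3}$$

If $\chi \ne \chi_0$, then by estimate (3) of §22, the above is 

$$\ll (1 + |\beta|N)N\exp(-c_1\sqrt{\log N}).$$

To treat $\chi_0$, we let $\psi(u, \chi_0) = [u] + R(u)$, and we put 
$T(\beta) = \sum_{k \le N}e(k\beta)$. Again it is easily seen that 

$$T(\beta) = e(N\beta)N - 2\pi i\beta\int_1^Ne(u\beta)[u]\,du.$$

By subtracting this from (3) we find that 

$$\begin{aligned}
\sum_{k \le N}\chi_0(k)\Lambda(k)e(k\beta) &= T(\beta) + e(N\beta)R(N) - 2\pi i\beta\int_1^Ne(u\beta)R(u)\,du \\
&= T(\beta) + O((1 + |\beta|N)N\exp(-c_1\sqrt{\log N})).
\end{aligned}$$

In §9 we saw that $\tau(\chi_0) = \mu(q)$ and that $|\tau(\chi)| \le q^{1/2}$ for any χ (mod q). 
On combining these estimates in (2) we conclude that 

$$S(\alpha) = \frac{\mu(q)}{\phi(q)}T(\beta) + O((1 + |\beta|N)q^{1/2}N\exp(-c_1\sqrt{\log N})).$$

But $q \le P$ and $|\beta| \le 1/Q$ for $\alpha \in \mathfrak M(q, a)$, so that 

$$S(\alpha) = \frac{\mu(q)}{\phi(q)}T(\beta) + O(N\exp(-c_2\sqrt{\log N}))$$

for $\alpha \in \mathfrak M(q, a)$. Consequently 

$$S(\alpha)^3 = \frac{\mu(q)}{\phi(q)^3}T(\beta)^3 + O(N^3\exp(-c_2\sqrt{\log N})),$$

and hence the contribution of $\mathfrak M(q, a)$ to the integral (1) is 

$$\frac{\mu(q)}{\phi(q)^3}e\Bigl(-\frac{aN}{q}\Bigr)\int_{-1/Q}^{1/Q}T(\beta)^3e(-N\beta)\,d\beta + O(N^2\exp(-c_2\sqrt{\log N})).$$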
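At the centre of a major arc (β = 0, so that T(β) = N) the approximation above predicts that $S(a/q)$ is close to $\mu(q)N/\phi(q)$. The following Python sketch compares the two for a few small q; the value N = 20000, the helper names `mangoldt_table` and `S`, and the hard-coded values of $\mu(q)/\phi(q)$ are all choices made for this illustration.

```python
import cmath
import math

def mangoldt_table(N):
    """lam[k] = Lambda(k) for 0 <= k <= N, filled in over prime powers."""
    lam = [0.0] * (N + 1)
    is_prime = [True] * (N + 1)
    for p in range(2, N + 1):
        if is_prime[p]:
            for m in range(2 * p, N + 1, p):
                is_prime[m] = False
            pk = p
            while pk <= N:
                lam[pk] = math.log(p)
                pk *= p
    return lam

def S(alpha, lam):
    """S(alpha) = sum_{k <= N} Lambda(k) e(k alpha), e(x) = exp(2 pi i x)."""
    return sum(l * cmath.exp(2j * math.pi * k * alpha)
               for k, l in enumerate(lam) if l)

N = 20000
lam = mangoldt_table(N)
# Each triple is (q, a, mu(q)/phi(q)) with (a, q) = 1.
for q, a, main in [(1, 1, 1.0), (3, 1, -0.5), (4, 1, 0.0), (5, 2, -0.25)]:
    print(q, a, S(a / q, lam) / N, main)
```

The printed ratios S(a/q)/N should be complex numbers near the real values in the last column; the discrepancy reflects the error term, which is not small in absolute terms for N of this size.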
Summing over the various major arcs, we see that 

$$\int_{\mathfrak M}S(\alpha)^3e(-N\alpha)\,d\alpha = \sum_{q \le P}\frac{\mu(q)c_q(N)}{\phi(q)^3}\int_{-1/Q}^{1/Q}T(\beta)^3e(-N\beta)\,d\beta + O(N^2\exp(-c_3\sqrt{\log N})), \tag{4}$$

where $c_q(n)$ is Ramanujan’s sum, 

$$c_q(n) = \sum_{\substack{a=1\\ (a,q)=1}}^{q}e\Bigl(\frac{an}{q}\Bigr).$$

We now estimate the integral and the sum occurring on the right-hand 
side of (4). The sum $T(\beta)$ is a geometric series with sum 

$$T(\beta) = \frac{e((N + 1)\beta) - e(\beta)}{e(\beta) - 1} \ll \min(N, \|\beta\|^{-1}).$$

Hence 

$$\int_{1/Q}^{1-1/Q}|T(\beta)|^3\,d\beta \ll Q^2 \ll N^2(\log N)^{-2B},$$

so that 

$$\int_{-1/Q}^{1/Q}T(\beta)^3e(-N\beta)\,d\beta = \int_0^1T(\beta)^3e(-N\beta)\,d\beta + O(N^2(\log N)^{-2B}).$$

The integral on the right is equal to the number of ways of writing N 
in the form $N = k_1 + k_2 + k_3$, and this is 

$$\tfrac12(N - 1)(N - 2) = \tfrac12N^2 + O(N).$$

Hence 

$$\int_{-1/Q}^{1/Q}T(\beta)^3e(-N\beta)\,d\beta = \tfrac12N^2 + O(N^2(\log N)^{-2B}). \tag{5}$$

To deal with the sum in (4) we first evaluate Ramanujan’s sum 
$c_q(n)$. Grouping residue classes a (mod q) according to the value of 
(a, q), we see that 

$$\sum_{a=1}^{q}e\Bigl(\frac{an}{q}\Bigr) = \sum_{d|q}\sum_{\substack{a=1\\ (a,q)=d}}^{q}e\Bigl(\frac{an}{q}\Bigr) = \sum_{d|q}c_{q/d}(n).$$
Here the sum on the left vanishes if q does not divide n, and is equal 
to q if q divides n; thus by Möbius inversion, 

$$c_q(n) = \sum_{d|(q,n)}d\,\mu\Bigl(\frac{q}{d}\Bigr). \tag{6}$$

It is now clear that $c_q(n)$ is a multiplicative function of q for any 
fixed n. Let $p^{\alpha}$ be the highest power of p dividing n. Then 
$c_{p^{\beta}}(n) = \phi(p^{\beta})$ for $\beta \le \alpha$, $c_{p^{\alpha+1}}(n) = -p^{\alpha}$, and $c_{p^{\beta}}(n) = 0$ for $\beta > \alpha + 1$. 
Hence 

$$c_q(n) = \frac{\mu(q/(q, n))\phi(q)}{\phi(q/(q, n))}. \tag{7}$$
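Formulas (6) and (7) can be compared with the definition of Ramanujan’s sum directly. The Python sketch below computes $c_q(n)$ in the three ways for small q and n; the function names are assumptions made for this illustration.

```python
import cmath
import math

def mobius(n):
    """Moebius function by trial division."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0
            result = -result
        p += 1
    return -result if n > 1 else result

def phi(n):
    """Euler's totient, counted directly."""
    return sum(1 for a in range(1, n + 1) if math.gcd(a, n) == 1)

def c_direct(q, n):
    """Ramanujan's sum from its definition, rounded to the nearest integer."""
    total = sum(cmath.exp(2j * math.pi * a * n / q)
                for a in range(1, q + 1) if math.gcd(a, q) == 1)
    return round(total.real)

def c_divisor_sum(q, n):
    """Formula (6): sum of d * mu(q/d) over common divisors d of q and n."""
    g = math.gcd(q, n)
    return sum(d * mobius(q // d) for d in range(1, g + 1) if g % d == 0)

def c_closed_form(q, n):
    """Formula (7): mu(q/(q,n)) * phi(q) / phi(q/(q,n))."""
    g = math.gcd(q, n)
    return mobius(q // g) * phi(q) // phi(q // g)

for q in range(1, 31):
    for n in range(1, 31):
        assert c_direct(q, n) == c_divisor_sum(q, n) == c_closed_form(q, n)
```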


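From the trivial estimate $|c_q(n)| \le \phi(q)$ we see that 

$$\sum_{q > P}\frac{\mu(q)c_q(N)}{\phi(q)^3} \ll \sum_{q > P}\phi(q)^{-2} \ll (\log N)^{-B+1}.$$

The sum in (4), when extended over all q, can be written as an 
absolutely convergent product, 

$$\sum_{q=1}^{\infty}\frac{\mu(q)c_q(N)}{\phi(q)^3} = \prod_p\Bigl(1 - \frac{c_p(N)}{\phi(p)^3}\Bigr) = \prod_{p|N}\Bigl(1 - \frac{1}{(p - 1)^2}\Bigr)\prod_{p \nmid N}\Bigl(1 + \frac{1}{(p - 1)^3}\Bigr) = \mathfrak S(N),$$

so that 

$$\sum_{q \le P}\frac{\mu(q)c_q(N)}{\phi(q)^3} = \mathfrak S(N) + O((\log N)^{-B+1}).$$

We combine this with (5) in (4) to see that 

$$\int_{\mathfrak M}S(\alpha)^3e(-N\alpha)\,d\alpha = \tfrac12\mathfrak S(N)N^2 + O(N^2(\log N)^{-B+1}). \tag{8}$$

To complete the argument we must show that the minor arcs 
contribute a smaller amount. We note that 

$$\Bigl|\int_{\mathfrak m}S(\alpha)^3e(-N\alpha)\,d\alpha\Bigr| \le \Bigl(\max_{\alpha \in \mathfrak m}|S(\alpha)|\Bigr)\int_{\mathfrak m}|S(\alpha)|^2\,d\alpha \le \Bigl(\max_{\alpha \in \mathfrak m}|S(\alpha)|\Bigr)\int_0^1|S(\alpha)|^2\,d\alpha.$$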
This last integral is 

$$\sum_{k_1 \le N}\sum_{k_2 \le N}\Lambda(k_1)\Lambda(k_2)\int_0^1e((k_1 - k_2)\alpha)\,d\alpha = \sum_{k \le N}\Lambda(k)^2 \ll N\log N.$$

Dirichlet’s theorem on Diophantine approximation asserts that for 
any real α and any real number $Q \ge 1$, there is a rational number a/q 
such that $|\alpha - a/q| \le 1/qQ$, $1 \le q \le Q$, and (a, q) = 1. If $q \le P$, 
then $\alpha \in \mathfrak M(q, a)$; hence if $\alpha \in \mathfrak m$, then $P < q \le Q$. That is, for each 
$\alpha \in \mathfrak m$ we have a/q with $|\alpha - a/q| \le 1/qQ \le 1/q^2$, (a, q) = 1, and 
$P < q \le Q$. Hence by our estimates in §25, 

$$S(\alpha) \ll N(\log N)^{-(B/2)+4}$$

for $\alpha \in \mathfrak m$, and therefore 

$$\int_{\mathfrak m}S(\alpha)^3e(-N\alpha)\,d\alpha \ll N^2(\log N)^{-(B/2)+5}.$$

This with (8) gives the desired result, on taking B = 2A + 10. 
THE LARGE SIEVE 


The large sieve was first proposed by Linnik¹ in a short but 
important paper of 1941. In a subsequent series of papers, Rényi 
developed the method by adopting a probabilistic attitude. His 
estimates were not optimal, and in 1965 Roth² substantially 
modified Rényi’s approach to obtain an essentially optimal result. 
Bombieri³ further refined the large sieve, and used it to describe the 
distribution of primes in arithmetic progressions; this we shall 
discuss in the following section. 

Rényi’s approach to the large sieve concerns an extension of 
Bessel’s inequality. We recall that Bessel’s inequality asserts that if 
$\phi_1, \phi_2, \ldots, \phi_R$ are orthonormal members of an inner product space 
V over the complex numbers, and if $\xi \in V$, then 

$$\sum_{r=1}^{R}|(\xi, \phi_r)|^2 \le \|\xi\|^2.$$

In number theory we frequently encounter vectors which are not 
quite orthonormal. Thus, with possible applications in mind, we 
seek an inequality 

$$\sum_{r=1}^{R}|(\xi, \phi_r)|^2 \le A\|\xi\|^2, \tag{1}$$

valid for all ξ, where A depends on $\phi_1, \ldots, \phi_R$; we hope to find that 
A is near 1 when the $\phi_r$ are in some sense nearly orthonormal. 
Boas⁴ has characterized the constant A for which (1) holds: The 
inequality (1) holds for all ξ if and only if 

$$\sum_{r,s}u_r\bar u_s(\phi_r, \phi_s) \le A\sum_r|u_r|^2 \tag{2}$$


1 Dokl. Akad. Nauk SSSR, 30, 292-294 (1941). 
2 Mathematika, 12, 1-9 (1965). 

3 Mathematika, 12, 201-225 (1965). 

4 Amer. J. Math., 63, 361-370 (1941). 
for all complex numbers $u_r$. To see this, suppose first that (2) holds. 
Then 

$$0 \le \Bigl\|\xi - \sum_{r=1}^{R}u_r\phi_r\Bigr\|^2 = \|\xi\|^2 - 2\operatorname{Re}\sum_r\bar u_r(\xi, \phi_r) + \sum_{r,s}u_r\bar u_s(\phi_r, \phi_s),$$

and by (2) this is 

$$\le \|\xi\|^2 - 2\operatorname{Re}\sum_{r=1}^{R}\bar u_r(\xi, \phi_r) + A\sum_{r=1}^{R}|u_r|^2.$$

We now take $u_r = (\xi, \phi_r)/A$, and then the above simplifies to read 

$$0 \le \|\xi\|^2 - \frac{1}{A}\sum_r|(\xi, \phi_r)|^2,$$

which gives (1). We note that if the $\phi_r$ are orthonormal then equality 
holds in (2) with A = 1, and then our argument reduces to the usual 
proof of Bessel’s inequality. 
To demonstrate the converse, we assume that (1) holds for all ξ, 
and we take $\xi = \sum_{r=1}^{R}u_r\phi_r$. Then the left-hand side of (2) is 

$$\|\xi\|^2 = \sum_{s=1}^{R}\bar u_s(\xi, \phi_s).$$

By Cauchy’s inequality this is 

$$\le \Bigl(\sum_{s=1}^{R}|u_s|^2\Bigr)^{1/2}\Bigl(\sum_{s=1}^{R}|(\xi, \phi_s)|^2\Bigr)^{1/2},$$

and by (1) this is 

$$\le A^{1/2}\|\xi\|\Bigl(\sum_{s=1}^{R}|u_s|^2\Bigr)^{1/2}.$$

We divide both sides by $\|\xi\|$ and square, to see that 

$$\|\xi\|^2 \le A\sum_{s=1}^{R}|u_s|^2,$$

which is (2). 

A great deal is known concerning bounds for bilinear forms such 
as (2); we content ourselves with a simple argument which is not 
always efficient but which suffices here. We note that 

$$|u_r\bar u_s| \le \tfrac12|u_r|^2 + \tfrac12|u_s|^2;$$
hence the left-hand side of (2) is 

$$\le \sum_{r,s}(\tfrac12|u_r|^2 + \tfrac12|u_s|^2)|(\phi_r, \phi_s)| = \sum_r|u_r|^2\sum_s|(\phi_r, \phi_s)| \le \Bigl(\max_r\sum_{s=1}^{R}|(\phi_r, \phi_s)|\Bigr)\sum_{r=1}^{R}|u_r|^2.$$

Thus (2) holds with 

$$A = \max_r\sum_{s=1}^{R}|(\phi_r, \phi_s)|; \tag{3}$$

and we have proved 

THEOREM 1. Let $\phi_1, \phi_2, \ldots, \phi_R$ and ξ be arbitrary vectors in 
an inner product space V over the complex numbers. Then 

$$\sum_{r=1}^{R}|(\xi, \phi_r)|^2 \le A\|\xi\|^2,$$

where A is given by (3). 

If the $\phi_r$ are orthonormal, then A = 1 in (3), and we see that the 
above includes Bessel’s inequality as a special case. Moreover, if 
the inner product matrix $[(\phi_r, \phi_s)]$ is near the identity, then A is 
near 1. 
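Theorem 1 can be tried out on random data. The Python sketch below works in the space of complex n-tuples with the standard inner product; the dimensions, the random seed and the helper names `inner`, `theorem1_constant` and `random_vector` are assumptions made for this illustration.

```python
import random

def inner(x, y):
    """Standard inner product on C^n, conjugate-linear in the second slot."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def theorem1_constant(phis):
    """A = max_r sum_s |(phi_r, phi_s)|, as in (3)."""
    return max(sum(abs(inner(p, q)) for q in phis) for p in phis)

def random_vector(rng, n):
    return [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]

rng = random.Random(0)
n, R = 6, 4
phis = [random_vector(rng, n) for _ in range(R)]
xi = random_vector(rng, n)

lhs = sum(abs(inner(xi, p)) ** 2 for p in phis)      # sum of |(xi, phi_r)|^2
rhs = theorem1_constant(phis) * abs(inner(xi, xi))   # A * ||xi||^2
print(lhs, rhs)
```

For vectors that are far from orthogonal, as random ones usually are, the constant A from (3) can be much larger than 1, so the inequality typically holds with room to spare.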

Rényi applied inequalities such as the above directly to arithmetic 
sequences. One of Roth’s innovations was to begin with exponential 
sums; this yielded vectors which are more nearly orthogonal. 
Following Davenport and Halberstam⁵, we consider the large sieve 
to be an inequality of the following kind: Let 

$$S(\alpha) = \sum_{n=M+1}^{M+N}a_ne(n\alpha) \tag{4}$$

where M and N are integers, N > 0, let $\alpha_1, \alpha_2, \ldots, \alpha_R$ be distinct 
(mod 1), and let δ > 0 be such that $\|\alpha_r - \alpha_s\| \ge \delta$ for r ≠ s. Then 
for arbitrary $a_n$, 

$$\sum_{r=1}^{R}|S(\alpha_r)|^2 \le \Delta\sum_{n=M+1}^{M+N}|a_n|^2. \tag{5}$$

Here Δ is to depend only on N and δ; our first concern is to determine 
how Δ depends on these two parameters. In passing we note that the 
value of M is irrelevant, since for any K we can put 

$$T(\alpha) = \sum_{n=K+1}^{K+N}a_{M-K+n}e(n\alpha) = e((K - M)\alpha)S(\alpha),$$

5 Mathematika, 13, 91-96 (1966); 14, 229-232 (1967). 
so that T has frequencies in the range $K + 1 \le n \le K + N$, and 
$|T(\alpha)| = |S(\alpha)|$. 
If R = 1 then the situation is particularly simple, for by Cauchy’s 
inequality 

$$|S(\alpha)|^2 \le N\sum_{n=M+1}^{M+N}|a_n|^2. \tag{6}$$

This is best possible, since equality occurs when $a_n = e(-n\alpha)$ for 
all n. Hence $\Delta \ge N$. On the other hand, 

$$\int_0^1\sum_{r=1}^{R}|S(\alpha_r + \beta)|^2\,d\beta = R\int_0^1|S(\beta)|^2\,d\beta = R\sum_{n=M+1}^{M+N}|a_n|^2,$$

so that there is a β for which 

$$\sum_{r=1}^{R}|S(\alpha_r + \beta)|^2 \ge R\sum_{n=M+1}^{M+N}|a_n|^2.$$

If $\delta R \le 1$, we can choose R points separated by at least δ (mod 1); 
hence R can be as large as $[\delta^{-1}] > \delta^{-1} - 1$ and we see that 
$\Delta > \delta^{-1} - 1$. These considerations show that the following theorem is 
essentially the best possible. 

THEOREM 2. Let $S(\alpha)$ be given by (4). Then (5) holds with 
$\Delta = N + 3\delta^{-1}$. 

Proof. We first observe that in view of (6) we may restrict our 
attention to cases in which $R \ge 2$, so that $\delta \le \tfrac12$. Also, by our remark 
about the role of M we may assume that $M = -[\tfrac12(N + 1)]$; thus 
it suffices to show that 

$$\sum_{r=1}^{R}\Bigl|\sum_{k=-K}^{K}a_ke(k\alpha_r)\Bigr|^2 \le (2K + 3\delta^{-1})\sum_{k=-K}^{K}|a_k|^2$$

for $\delta \le \tfrac12$. We appeal to Theorem 1 with the usual inner product, 
$(\phi, \psi) = \sum_k\phi_k\bar\psi_k$, taking $\xi = \{a_kb_k^{-1/2}\}_{k=-K}^{K}$ and 

$$\phi_r = \{b_k^{1/2}e(-k\alpha_r)\}_{k=-\infty}^{\infty}.$$

Here the $b_k$ are nonnegative, and strictly positive for $-K \le k \le K$. 
Then by Theorem 1, 

$$\sum_{r=1}^{R}\Bigl|\sum_{k=-K}^{K}a_ke(k\alpha_r)\Bigr|^2 \le A\sum_{k=-K}^{K}|a_k|^2b_k^{-1},$$

where 

$$A = \max_r\sum_{s=1}^{R}|B(\alpha_r - \alpha_s)|;$$
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here 
Bia) = yb: e(ka). 


It now suffices to choose nonnegative $b_k$ such that $b_k \ge 1$ for
$-K \le k \le K$, and such that

$$\sum_{s=1}^{R} |B(\alpha_r - \alpha_s)| \le 2K + 3\delta^{-1} \tag{7}$$

for all $r$. If we were to take $b_k = 1$ for $-K \le k \le K$, $b_k = 0$ otherwise,
we would obtain the inferior estimate

$$\sum_{s=1}^{R} |B(\alpha_r - \alpha_s)| \le 2K + O(\delta^{-1} \log \delta^{-1}).$$
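To obtain a sharper estimate we take smoother $b_k$, namely

$$b_k = \begin{cases} 1 & \text{if } |k| \le K, \\ 1 - (|k| - K)/L & \text{if } K \le |k| \le K + L, \\ 0 & \text{if } |k| \ge K + L, \end{cases}$$

where $L$ is a positive integer to be selected later. To write $B(\alpha)$ in
closed form we appeal to the identity

$$\sum_{|j| \le J} (J - |j|) e(j\alpha) = \Big( \frac{\sin \pi J \alpha}{\sin \pi \alpha} \Big)^2,$$

firstly with $J = K + L$ and secondly with $J = K$, for by subtraction
we then find that

$$B(\alpha) = \frac{1}{L} \big( (\sin \pi (K + L)\alpha)^2 - (\sin \pi K \alpha)^2 \big) (\sin \pi\alpha)^{-2}.$$

Hence $B(0) = 2K + L$, and

$$|B(\alpha)| \le \frac{1}{L} (\sin \pi\alpha)^{-2} \le (4L\|\alpha\|^2)^{-1},$$

so that

$$\sum_{s=1}^{R} |B(\alpha_r - \alpha_s)| \le 2K + L + 2 \sum_{h=1}^{\infty} (4Lh^2\delta^2)^{-1}.$$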


To evaluate this last term it is useful to know that

$$\sum_{h=1}^{\infty} h^{-2} = \zeta(2) = \frac{\pi^2}{6}.$$

However, for our purposes it is sufficient to note that

$$\sum_{h=1}^{\infty} h^{-2} < 1 + \int_1^{\infty} u^{-2} \, du = 2,$$

by the integral test. Hence

$$\sum_{s=1}^{R} |B(\alpha_r - \alpha_s)| \le 2K + L + \frac{1}{L\delta^2}.$$

We now let $L$ be the least integer $\ge \delta^{-1}$, for then the above is

$$\le 2K + \delta^{-1} + 1 + \delta^{-1} \le 2K + 3\delta^{-1},$$

since $\delta \le \frac12$. Thus we have (7), and the proof is complete.
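Inequality (5) with the constant of Theorem 2 is easy to test numerically. The following is a minimal sketch, not part of the original argument: the helper names (`e`, `circle_gap`, `large_sieve_sides`), the random coefficients and the random points are all illustrative assumptions. It computes $\delta$ as the smallest gap (mod 1) between the chosen points and prints both sides of (5) with $\Delta = N + 3\delta^{-1}$.

```python
import cmath
import math
import random

def e(t):
    """e(t) = exp(2*pi*i*t)."""
    return cmath.exp(2j * math.pi * t)

def circle_gap(points):
    """Smallest distance (mod 1) between two distinct points (needs >= 2 points)."""
    pts = sorted(p % 1.0 for p in points)
    gaps = [b - a for a, b in zip(pts, pts[1:])]
    gaps.append(1.0 - pts[-1] + pts[0])  # wrap-around gap
    return min(gaps)

def large_sieve_sides(coeffs, M, points):
    """Return (sum_r |S(alpha_r)|^2, sum_n |a_n|^2) for n = M+1, ..., M+N."""
    lhs = 0.0
    for alpha in points:
        S = sum(a * e((M + 1 + i) * alpha) for i, a in enumerate(coeffs))
        lhs += abs(S) ** 2
    return lhs, sum(abs(a) ** 2 for a in coeffs)

# Illustrative data: random complex coefficients and random points on the circle.
random.seed(1)
N, M, R = 40, -20, 12
coeffs = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(N)]
points = [random.random() for _ in range(R)]
delta = circle_gap(points)
lhs, norm2 = large_sieve_sides(coeffs, M, points)
print(lhs, (N + 3 / delta) * norm2)  # left side of (5), then Delta * sum |a_n|^2
```

With a single point $\alpha$ and $a_n = e(-n\alpha)$ the same helper reproduces the equality case of (6), which is a convenient sanity check on the implementation.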

A. Selberg chose the $b_k$ more carefully, and obtained the sharper
value $\Delta = N + \delta^{-1} - 1$; this and other refinements are found in
the survey article of Montgomery⁶.

Gallagher⁷ has devised a different approach to the large sieve; his
method gives $\Delta = \pi N + \delta^{-1}$ which is sharper than Theorem 2 when
$N\delta$ is small. We do not need his results, but we describe his method as
it is very flexible, and can be used to advantage in other contexts.
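If $f$ has a continuous first derivative in $[0, 1]$ then

$$f(x) = \int_0^1 f(u) \, du + \int_0^x u f'(u) \, du + \int_x^1 (u - 1) f'(u) \, du$$

for $0 \le x \le 1$, as we may verify by integrating by parts. Hence

$$|f(\tfrac12)| \le \int_0^1 \big( |f(u)| + \tfrac12 |f'(u)| \big) \, du,$$

and in general

$$|f(x)| \le \int_0^1 \big( |f(u)| + |f'(u)| \big) \, du$$

for $0 \le x \le 1$. After a change of variables the first inequality takes
the form

$$|f(\alpha)| \le \int_{\alpha - \frac12\delta}^{\alpha + \frac12\delta} \big( \delta^{-1} |f(\beta)| + \tfrac12 |f'(\beta)| \big) \, d\beta.$$

We take $f(\alpha) = S(\alpha)^2$ and sum, to see that

$$\sum_{r=1}^{R} |S(\alpha_r)|^2 \le \sum_{r=1}^{R} \int_{\alpha_r - \frac12\delta}^{\alpha_r + \frac12\delta} \big( \delta^{-1} |S(\alpha)|^2 + |S(\alpha) S'(\alpha)| \big) \, d\alpha.$$

6 Bull. Amer. Math. Soc., 84, 547-567 (1978).
7 Mathematika, 14, 14-20 (1967).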
The intervals of integration are nonoverlapping and the integrand
is nonnegative, so the above is

$$\le \delta^{-1} \int_0^1 |S(\alpha)|^2 \, d\alpha + \int_0^1 |S(\alpha) S'(\alpha)| \, d\alpha.$$

By Parseval's identity the first integral is $\sum_{n=M+1}^{M+N} |a_n|^2$; this is easily
verified by expanding and integrating term-by-term. By Cauchy's
inequality, the second integral is

$$\le \Big( \int_0^1 |S(\alpha)|^2 \, d\alpha \Big)^{1/2} \Big( \int_0^1 |S'(\alpha)|^2 \, d\alpha \Big)^{1/2}.$$

Again by Parseval's identity, this is

$$\Big( \sum_{n=M+1}^{M+N} |a_n|^2 \Big)^{1/2} \Big( \sum_{n=M+1}^{M+N} |2\pi i n a_n|^2 \Big)^{1/2}.$$

Without loss of generality we may suppose that $M = -[\frac12(N + 1)]$
so that $|n| \le \frac12 N$ for $M + 1 \le n \le M + N$. Then the above is

$$\le \pi N \sum_{n=M+1}^{M+N} |a_n|^2,$$

and we have the large sieve with $\Delta = \delta^{-1} + \pi N$.
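The remark that Gallagher's constant is sharper than that of Theorem 2 "when $N\delta$ is small" can be made explicit by comparing the two values of $\Delta$ directly. This short worked comparison is an illustration added here, not a statement from the original text:

$$\pi N + \delta^{-1} \le N + 3\delta^{-1} \iff (\pi - 1) N \le 2\delta^{-1} \iff N\delta \le \frac{2}{\pi - 1} \approx 0.934.$$

So Gallagher's value is the smaller of the two precisely when $N\delta$ is at most about $0.934$, and Theorem 2 is the better bound for larger $N\delta$.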
In our applications of the large sieve we shall take the points $\alpha_r$ to
be the Farey fractions $a/q$, $(a, q) = 1$, $q \le Q$. If $a/q$ and $a'/q'$ are two
distinct such fractions, then

$$\Big\| \frac{a}{q} - \frac{a'}{q'} \Big\| = \Big\| \frac{aq' - a'q}{qq'} \Big\| \ge \frac{1}{qq'} \ge Q^{-2};$$

hence we can apply the large sieve with $\delta = Q^{-2}$ to obtain the
inequality

$$\sum_{q \le Q} \sum_{\substack{a=1 \\ (a,q)=1}}^{q} \Big| S\Big(\frac{a}{q}\Big) \Big|^2 \le (N + 3Q^2) \sum_{n=M+1}^{M+N} |a_n|^2. \tag{8}$$

We now use the above result to formulate the large sieve in the
manner of Rényi. Let $\mathcal{N}$ be a set of $Z$ integers in the interval
$M + 1 \le n \le M + N$, and let $Z(q, h)$ denote the number of these integers
which are congruent to $h$ (mod $q$). Clearly

$$\sum_{h=1}^{q} Z(q, h) = Z,$$

so that the average of $Z(q, h)$ is $Z/q$. Rényi considered the mean
square error, i.e., the "variance"

$$V(q) = \sum_{h=1}^{q} \Big( Z(q, h) - \frac{Z}{q} \Big)^2.$$
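From the large sieve we find that the numbers $V(p)$ are on average
small; we find that

$$\sum_{p \le Q} p V(p) \le (N + 3Q^2) Z. \tag{9}$$

To see this we let $a_n$ be the characteristic function of the set $\mathcal{N}$, so
that

$$S(\alpha) = \sum_{n \in \mathcal{N}} e(n\alpha).$$

Then

$$\sum_{a=1}^{q} \Big| S\Big(\frac{a}{q}\Big) \Big|^2 = \sum_{m \in \mathcal{N}} \sum_{n \in \mathcal{N}} \sum_{a=1}^{q} e\Big( \frac{(m - n)a}{q} \Big).$$

The innermost sum is $q$ or $0$, according as $m \equiv n$ (mod $q$) or not;
hence the above is

$$q \sum_{\substack{m \in \mathcal{N},\ n \in \mathcal{N} \\ m \equiv n \ (\mathrm{mod}\ q)}} 1 = q \sum_{h=1}^{q} Z(q, h)^2.$$

Thus when we expand the square in $V(q)$, we see that

$$qV(q) = q \sum_{h=1}^{q} Z(q, h)^2 - 2Z \sum_{h=1}^{q} Z(q, h) + Z^2 = \sum_{a=1}^{q} \Big| S\Big(\frac{a}{q}\Big) \Big|^2 - Z^2.$$

But $S(0) = Z$, so that

$$qV(q) = \sum_{a=1}^{q-1} \Big| S\Big(\frac{a}{q}\Big) \Big|^2,$$

and consequently by (8),

$$\sum_{p \le Q} p V(p) = \sum_{p \le Q} \sum_{a=1}^{p-1} \Big| S\Big(\frac{a}{p}\Big) \Big|^2 \le (N + 3Q^2) Z.$$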
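The identity $qV(q) = \sum_{a=1}^{q-1} |S(a/q)|^2$ and the bound (9) can be checked on a small example. The sketch below is only an illustration: the function names and the sample set (integers up to 300 whose decimal expansion avoids the digit 7) are hypothetical choices, not taken from the text.

```python
import cmath
import math

def e(t):
    """e(t) = exp(2*pi*i*t)."""
    return cmath.exp(2j * math.pi * t)

def variance(nset, q):
    """V(q) = sum over h (mod q) of (Z(q,h) - Z/q)^2."""
    Z = len(nset)
    counts = [0] * q
    for n in nset:
        counts[n % q] += 1
    return sum((c - Z / q) ** 2 for c in counts)

def exp_sum_side(nset, q):
    """sum_{a=1}^{q-1} |S(a/q)|^2, where S(alpha) = sum_{n in nset} e(n*alpha)."""
    return sum(abs(sum(e(n * a / q) for n in nset)) ** 2 for a in range(1, q))

def primes_up_to(Q):
    """Primes p <= Q by trial division (adequate for small Q)."""
    return [p for p in range(2, Q + 1)
            if all(p % d for d in range(2, math.isqrt(p) + 1))]

# Hypothetical sample set in the interval M+1 <= n <= M+N.
M, N, Q = 0, 300, 17
nset = [n for n in range(M + 1, M + N + 1) if "7" not in str(n)]
Z = len(nset)
for p in primes_up_to(Q):
    print(p, p * variance(nset, p), exp_sum_side(nset, p))  # the two columns agree
total = sum(p * variance(nset, p) for p in primes_up_to(Q))
print(total, (N + 3 * Q * Q) * Z)  # left and right sides of (9)
```

The identity holds for every modulus $q$, prime or not; only the final inequality (9) restricts to primes.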


Using (9) we can now present the large sieve as a sieve in the
elementary sense. Suppose that from the interval $M + 1 \le n \le M + N$
we remove several arithmetic progressions, and we let $\mathcal{N}$ denote
the remaining set. For example, suppose that we have removed those
numbers congruent to $h$ (mod $q$). Then $Z(q, h)$ is not near $Z/q$ as
would normally be the case, but instead $Z(q, h) = 0$. If this occurs
for many $h$ (mod $q$), then $V(q)$ is large, and if $V(p)$ is large for many
primes $p$, then $Z$ is small. More specifically, we have

THEOREM 3. Let $\mathcal{N}$ be a set of $Z$ integers in the interval
$M + 1 \le n \le M + N$. Let $\mathcal{P}$ be a set of $P$ prime numbers $p$, with $p \le Q$ for
all $p \in \mathcal{P}$. Let $0 < \tau \le 1$, and suppose that $Z(p, h) = 0$ for at least $\tau p$
values of $h$ (mod $p$), for all $p \in \mathcal{P}$. Then

$$Z \le \frac{N + 3Q^2}{\tau P}.$$

To see this we note that if $p \in \mathcal{P}$, then $V(p) \ge \tau p (Z/p)^2$, so that by
(9),

$$\tau P Z^2 \le (N + 3Q^2) Z.$$

This gives the desired bound.

To appreciate the strength of this bound, suppose that $\mathcal{N}$ is the
set of squares in the interval $1 \le n \le N$, let $Q = N^{1/2}$, and let $\mathcal{P}$ be the
set of odd primes $p \le N^{1/2}$. Then $Z(p, h) = 0$ for quadratic nonresidues
$h$ (mod $p$), so that $Z(p, h) = 0$ for at least $\frac12(p - 1)$ values
of $h$. Hence $\tau = \frac13$ and $P \sim 2N^{1/2}/\log N$, and we obtain the bound
$Z \ll N^{1/2} \log N$, which is not far from the truth, $Z \sim N^{1/2}$.
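The example of the squares can be worked out numerically. The following sketch (function names are illustrative assumptions) evaluates the bound of Theorem 3 with $Q = [N^{1/2}]$ and $\tau = \frac13$, and compares it with the true count $Z = [N^{1/2}]$ and with $N^{1/2} \log N$; it also confirms that the squares miss exactly $\frac12(p - 1)$ residue classes modulo an odd prime $p$.

```python
import math

def odd_primes_up_to(Q):
    """Odd primes p <= Q via the sieve of Eratosthenes (assumes Q >= 2)."""
    sieve = bytearray([1]) * (Q + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, math.isqrt(Q) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, Q + 1, i)))
    return [p for p in range(3, Q + 1) if sieve[p]]

def missed_classes(p):
    """Number of residue classes h (mod p) containing no square."""
    return p - len({n * n % p for n in range(p)})

def squares_bound(N):
    """(N + 3Q^2) / (tau * P) with Q = floor(sqrt(N)), tau = 1/3, P = #odd primes <= Q."""
    Q = math.isqrt(N)
    P = len(odd_primes_up_to(Q))
    tau = 1 / 3
    return (N + 3 * Q * Q) / (tau * P)

for N in (10**4, 10**6, 10**8):
    Z = math.isqrt(N)  # number of squares in 1 <= n <= N
    print(N, Z, squares_bound(N), math.sqrt(N) * math.log(N))
```

The printed bound exceeds $Z$ by a factor of order $\log N$, in line with the discussion above.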

To derive (9) from (8) we used only prime moduli. By taking more
care we can use composite moduli as well, and thus obtain a sharper
bound. This was first done by Bombieri and Davenport⁸ in a special
case, and by Montgomery⁹ in general; the result is that if $\mathcal{N}$ is a
set of $Z$ members in the interval $M + 1 \le n \le M + N$, and if
$\omega(p)$ is the number of $h$ (mod $p$) for which $Z(p, h) = 0$, then

$$Z \le \frac{N + 3Q^2}{L},$$

where

$$L = \sum_{q \le Q} \mu(q)^2 \prod_{p \mid q} \frac{\omega(p)}{p - \omega(p)}.$$

The large sieve, in the form of inequality (8), is useful also in
estimating averages of character sums, as was first observed by
Rényi¹⁰. Gallagher¹¹ found the following elegant formulation.

8 Abh. aus Zahlentheorie und Analysis zur Erinnerung an Edmund Landau, VEB
Deutsch. Verlag Wiss., Berlin, 1968, 11-22.

9 J. London Math. Soc., 43, 93-98 (1968).

10 Izv. Akad. Nauk SSSR Ser. Mat., 12, 57-78 (1948); Amer. Math. Soc. Transl.,
(2) 19, 299-321 (1962).

11 Mathematika, 14, 14-20 (1967).
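THEOREM 4. Let $\chi$ be a character (mod $q$), and put
$T(\chi) = \sum_{n=M+1}^{M+N} a_n \chi(n)$. Then for any $Q \ge 1$,

$$\sum_{q \le Q} \frac{q}{\phi(q)} \sum_{\chi}^{*} |T(\chi)|^2 \le (N + 3Q^2) \sum_{n=M+1}^{M+N} |a_n|^2.$$

Here $\sum^{*}$ denotes a sum over all primitive characters $\chi$ (mod $q$).
It suffices to show that

$$\frac{q}{\phi(q)} \sum_{\chi}^{*} |T(\chi)|^2 \le \sum_{\substack{a=1 \\ (a,q)=1}}^{q} \Big| S\Big(\frac{a}{q}\Big) \Big|^2, \tag{10}$$

for then the result follows from (8). To establish (10), we recall from
§9 that if $\chi$ is primitive (mod $q$) then

$$\chi(n) = \frac{1}{\tau(\bar\chi)} \sum_{a=1}^{q} \bar\chi(a) e\Big( \frac{an}{q} \Big)$$

for all $n$. On multiplying both sides by $a_n$ and summing, we see that

$$T(\chi) = \frac{1}{\tau(\bar\chi)} \sum_{a=1}^{q} \bar\chi(a) S\Big( \frac{a}{q} \Big).$$

As $|\tau(\chi)| = q^{1/2}$ for primitive $\chi$, we find that

$$\sum_{\chi}^{*} |T(\chi)|^2 = \frac{1}{q} \sum_{\chi}^{*} \Big| \sum_{a=1}^{q} \bar\chi(a) S\Big( \frac{a}{q} \Big) \Big|^2.$$

The right-hand side is increased if we drop the condition that $\chi$ be
primitive, and

$$\sum_{\chi} \Big| \sum_{a=1}^{q} \bar\chi(a) S\Big( \frac{a}{q} \Big) \Big|^2 = \sum_{a=1}^{q} \sum_{b=1}^{q} S\Big( \frac{a}{q} \Big) \overline{S\Big( \frac{b}{q} \Big)} \sum_{\chi} \bar\chi(a) \chi(b) = \phi(q) \sum_{\substack{a=1 \\ (a,q)=1}}^{q} \Big| S\Big( \frac{a}{q} \Big) \Big|^2.$$

Thus we have (10), and the proof is complete.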
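Inequality (10) can be tested directly for a prime modulus, where every nonprincipal character is primitive. The sketch below is an illustration under that simplifying assumption; the construction of the characters from a primitive root, the helper names and the random coefficients are all choices made here, not part of the text.

```python
import cmath
import math
import random

def e(t):
    """e(t) = exp(2*pi*i*t)."""
    return cmath.exp(2j * math.pi * t)

def primitive_root(q):
    """Smallest primitive root modulo an odd prime q (brute force)."""
    for g in range(2, q):
        if len({pow(g, k, q) for k in range(q - 1)}) == q - 1:
            return g
    raise ValueError("no primitive root found; q should be an odd prime")

def characters_mod_prime(q):
    """All Dirichlet characters mod a prime q, each as a list chi[n], n = 0..q-1."""
    g = primitive_root(q)
    index = {pow(g, k, q): k for k in range(q - 1)}  # discrete logarithm table
    chars = []
    for j in range(q - 1):
        chi = [0j] * q
        for n in range(1, q):
            chi[n] = e(j * index[n] / (q - 1))
        chars.append(chi)
    return chars  # chars[0] is principal; the others are primitive since q is prime

random.seed(2)
q, M, N = 11, 0, 30
a = {n: complex(random.gauss(0, 1), random.gauss(0, 1))
     for n in range(M + 1, M + N + 1)}

def T(chi):
    """T(chi) = sum a_n chi(n)."""
    return sum(a[n] * chi[n % q] for n in a)

def S(alpha):
    """S(alpha) = sum a_n e(n*alpha)."""
    return sum(a[n] * e(n * alpha) for n in a)

lhs = q / (q - 1) * sum(abs(T(chi)) ** 2 for chi in characters_mod_prime(q)[1:])
rhs = sum(abs(S(b / q)) ** 2 for b in range(1, q))
print(lhs, rhs)  # left and right sides of (10) for this q
```

The same character table can be used to confirm that the Gauss sums $\tau(\chi)$ have absolute value $q^{1/2}$, the fact used in the proof.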
BOMBIERI'S THEOREM


Rényi used the large sieve to show that prime numbers are well
distributed in arithmetic progressions (mod $q$) for most $q$; his
rather complicated result allowed him to show that every large even
number is representable in the form

$$p + p_1 p_2 \cdots p_r,$$

where $r$ is bounded by some absolute constant. The subsequent
refinements of Bombieri¹ and A. I. Vinogradov² enable one to take
$r = 3$, and recently Chen³ has added an ingenious new idea to
obtain $r = 2$.
We now develop Bombieri's elegant estimate, without pursuing
its applications. For brevity we put

$$E(x; q, a) = \psi(x; q, a) - \frac{x}{\phi(q)}$$

for $(a, q) = 1$, we let

$$E(x; q) = \max_{\substack{a \\ (a,q)=1}} |E(x; q, a)|,$$

and

$$E^*(x, q) = \max_{y \le x} E(y, q).$$

We prove that $E^*(x, q)$ is significantly smaller than $x/\phi(q)$ for most
$q \le x^{1/2}(\log x)^{-A}$.
THEOREM. Let $A > 0$ be fixed. Then

$$\sum_{q \le Q} E^*(x, q) \ll x^{1/2} Q (\log x)^5 \tag{1}$$

provided that $x^{1/2}(\log x)^{-A} \le Q \le x^{1/2}$.


1 Mathematika, 12, 201-225 (1965).

2 Izv. Akad. Nauk SSSR Ser. Mat., 29, 903-934 (1965); 30, 719-720 (1966).

3 Sci. Sinica, 16, 157-176 (1973); see also Chapter 11 of Halberstam and Richert,
Sieve Methods, Academic Press, London (1974).
To assess the strength of this bound we note that there are at most
$y/q + 1$ integers $n \le y$, $n \equiv a$ (mod $q$), and hence
$\psi(y; q, a) \ll xq^{-1} \log x$ for $q \le x$, $y \le x$, so that $E^*(x, q) \ll xq^{-1} \log x$ for $q \le x$.
Consequently the bound

$$\sum_{q \le Q} E^*(x, q) \ll x (\log x)^2$$

is trivial for $Q \le x$. On the other hand, from (1) we see that if
$Q = x^{1/2}(\log x)^{-2B-6}$, then

$$\psi(x; q, a) = \frac{x}{\phi(q)} \big( 1 + O((\log x)^{-B}) \big)$$

for all reduced residue classes $a$ (mod $q$), and for all $q \le Q$ with the
possible exception of at most $Q(\log x)^{-B}$ values of $q$.
Halberstam has conjectured that

$$\sum_{q \le x^{1-\epsilon}} E^*(x, q) \ll x (\log x)^{-A}$$

for any fixed positive $A$ and $\epsilon$; such a strengthening of Bombieri's
theorem would have important consequences.
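For small $x$ the quantities in the theorem can simply be tabulated. The sketch below is illustrative only: it computes $E(x; q)$ at the single point $x$ rather than the maximum $E^*(x, q)$ over $y \le x$ (a simplification assumed here to keep the code short), and the function names are not from the text. It prints the sum over $q \le x^{1/2}$ next to the trivial size $x(\log x)^2$.

```python
import math

def von_mangoldt_table(x):
    """lam[n] = Lambda(n) for 0 <= n <= x: log p at prime powers p^k, else 0."""
    lam = [0.0] * (x + 1)
    is_comp = bytearray(x + 1)
    for p in range(2, x + 1):
        if not is_comp[p]:
            for m in range(p * p, x + 1, p):
                is_comp[m] = 1
            pk = p
            while pk <= x:
                lam[pk] = math.log(p)
                pk *= p
    return lam

def E(x, q, lam):
    """E(x; q) = max over reduced a (mod q) of |psi(x; q, a) - x/phi(q)|."""
    psi = [0.0] * q
    for n in range(1, x + 1):
        psi[n % q] += lam[n]
    reduced = [a for a in range(q) if math.gcd(a, q) == 1]
    phi = len(reduced)
    return max(abs(psi[a] - x / phi) for a in reduced)

x = 10000
lam = von_mangoldt_table(x)
Q = math.isqrt(x)
total = sum(E(x, q, lam) for q in range(1, Q + 1))
print(total, x * math.log(x) ** 2)  # sum of E(x; q) for q <= Q, and the trivial size
```

Such a small computation says nothing about the implied constant in (1); it only shows how far below the trivial bound the sum sits in practice.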


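Our proof of the theorem falls in two parts. First we use our estimates
of §22 to show that the theorem follows from the bound

$$\sum_{q \le Q} \frac{q}{\phi(q)} \sum_{\chi}^{*} \max_{y \le x} |\psi(y, \chi)| \ll (x + x^{5/6} Q + x^{1/2} Q^2) (\log Qx)^4, \tag{2}$$

which is valid for all $x \ge 1$, $Q \ge 1$. Then we establish (2) by combining
the large sieve with the method of §24.
We recall that

$$\psi(y; q, a) = \frac{1}{\phi(q)} \sum_{\chi} \bar\chi(a) \psi(y, \chi).$$

From $\psi(y, \chi_0)$ we wish to subtract the main term $y$; accordingly we
put

$$\psi'(y, \chi) = \begin{cases} \psi(y, \chi) & \text{if } \chi \ne \chi_0, \\ \psi(y, \chi_0) - y & \text{if } \chi = \chi_0. \end{cases}$$

Then

$$\psi(y; q, a) - \frac{y}{\phi(q)} = \frac{1}{\phi(q)} \sum_{\chi} \bar\chi(a) \psi'(y, \chi),$$

and hence

$$|E(y; q, a)| \le \frac{1}{\phi(q)} \sum_{\chi} |\psi'(y, \chi)|.$$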
As this estimate is independent of $a$, we see that $E(y; q)$ satisfies the
same bound. If $\chi$ (mod $q$) is induced by $\chi_1$ (mod $q_1$) then $\psi'(y, \chi)$ and
$\psi'(y, \chi_1)$ are nearly equal, for

$$\psi'(y, \chi_1) - \psi'(y, \chi) = \sum_{\substack{p^k \le y \\ p \mid q}} \chi_1(p^k) \log p \ll \sum_{p \mid q} \sum_{\substack{k \ge 1 \\ p^k \le y}} \log p \ll (\log y) \sum_{p \mid q} \log p \ll (\log qy)^2.$$

Hence

$$E(y, q) \ll (\log qy)^2 + \frac{1}{\phi(q)} \sum_{\chi} |\psi'(y, \chi_1)|,$$

and thus

$$E^*(x, q) \ll (\log qx)^2 + \frac{1}{\phi(q)} \sum_{\chi} \max_{y \le x} |\psi'(y, \chi_1)|.$$


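We now combine all contributions made by an individual primitive
character. A primitive character $\chi$ (mod $q$) induces characters to
moduli which are multiples of $q$; hence the left-hand side of (1) is

$$\ll Q (\log Qx)^2 + \sum_{q \le Q} \sum_{\chi}^{*} \max_{y \le x} |\psi'(y, \chi)| \Big( \sum_{k \le Q/q} \frac{1}{\phi(kq)} \Big).$$

Here the first term is negligible. As for the second term, we note that
$\phi(kq) \ge \phi(k)\phi(q)$, so that

$$\sum_{k \le Q/q} \frac{1}{\phi(kq)} \le \frac{1}{\phi(q)} \sum_{k \le Q/q} \frac{1}{\phi(k)}.$$

Moreover,

$$\sum_{k \le z} \frac{1}{\phi(k)} \ll \log 2z.$$

Hence the second term above is

$$\ll (\log x) \sum_{q \le Q} \frac{1}{\phi(q)} \sum_{\chi}^{*} \max_{y \le x} |\psi'(y, \chi)|,$$

and so it suffices to show that

$$\sum_{q \le Q} \frac{1}{\phi(q)} \sum_{\chi}^{*} \max_{y \le x} |\psi'(y, \chi)| \ll x^{1/2} Q (\log x)^4 \tag{3}$$

for $x^{1/2}(\log x)^{-A} \le Q \le x^{1/2}$. We now consider large and small values
of $q$ separately. From (2) we see that

$$\sum_{U < q \le 2U} \frac{1}{\phi(q)} \sum_{\chi}^{*} \max_{y \le x} |\psi(y, \chi)| \ll \Big( \frac{x}{U} + x^{5/6} + x^{1/2} U \Big) (\log Ux)^4.$$

By summing this over $U = 2^k$ for an appropriate range of $k$, we see
that

$$\sum_{Q_1 < q \le Q} \frac{1}{\phi(q)} \sum_{\chi}^{*} \max_{y \le x} |\psi(y, \chi)| \ll \Big( \frac{x}{Q_1} + x^{5/6} \log Q + x^{1/2} Q \Big) (\log Qx)^4.$$

This is acceptable in (3) if $Q_1 = (\log x)^A$. If $\chi$ is a primitive character
(mod $q$), $q \le (\log x)^A$, $y \le x$, then by estimate (3) of §22,

$$\psi'(y, \chi) \ll x (\log x)^{-2A},$$

and hence the contribution of $q \le (\log x)^A$ in (3) is $\ll x(\log x)^{-A}$,
which is also acceptable. Thus the theorem follows from (2).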

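We now prove the estimate (2). In §24 we observed that our method
of estimating $\sum_{n \le N} f(n)\Lambda(n)$ fails if $f$ is multiplicative; in particular
we are not able to bound $\psi(x, \chi)$ by this method. Nevertheless we
can use the method to bound an average of $|\psi(x, \chi)|$ over various $\chi$,
by using the large sieve. More precisely, we use the large sieve in the
form of the inequality

$$\sum_{q \le Q} \frac{q}{\phi(q)} \sum_{\chi}^{*} \Big| \sum_{m=M+1}^{M+N} a_m \chi(m) \Big|^2 \ll (N + Q^2) \sum_{m=M+1}^{M+N} |a_m|^2 \tag{4}$$

to show that

$$\sum_{q \le Q} \frac{q}{\phi(q)} \sum_{\chi}^{*} \max_u \Big| \sum_{1 \le m \le M} \sum_{\substack{1 \le n \le N \\ mn \le u}} a_m b_n \chi(mn) \Big| \ll (M + Q^2)^{1/2} (N + Q^2)^{1/2} \Big( \sum_{1 \le m \le M} |a_m|^2 \Big)^{1/2} \Big( \sum_{1 \le n \le N} |b_n|^2 \Big)^{1/2} \log 2MN; \tag{5}$$

we use this in the method of §24.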
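To derive (5) we first note that by (4) and Cauchy's inequality,

$$\sum_{q \le Q} \frac{q}{\phi(q)} \sum_{\chi}^{*} \Big| \sum_{m=1}^{M} \sum_{n=1}^{N} a_m b_n \chi(mn) \Big| \le \Big( \sum_{q \le Q} \frac{q}{\phi(q)} \sum_{\chi}^{*} \Big| \sum_{m=1}^{M} a_m \chi(m) \Big|^2 \Big)^{1/2} \Big( \sum_{q \le Q} \frac{q}{\phi(q)} \sum_{\chi}^{*} \Big| \sum_{n=1}^{N} b_n \chi(n) \Big|^2 \Big)^{1/2}$$

$$\ll (M + Q^2)^{1/2} (N + Q^2)^{1/2} \Big( \sum_{m=1}^{M} |a_m|^2 \Big)^{1/2} \Big( \sum_{n=1}^{N} |b_n|^2 \Big)^{1/2}. \tag{6}$$

To introduce the condition $mn \le u$ we appeal to the Lemma of §17
(with $c$ tending to 0), from which we see that if $T > 0$, $\beta > 0$, and $\alpha$
is real, then

$$\int_{-T}^{T} e^{it\alpha} \frac{\sin t\beta}{\pi t} \, dt = \begin{cases} 1 + O(T^{-1}(\beta - |\alpha|)^{-1}) & \text{if } |\alpha| < \beta, \\ O(T^{-1}(|\alpha| - \beta)^{-1}) & \text{if } |\alpha| > \beta. \end{cases}$$

Putting $\beta = \log u$, we find that

$$\sum_{m=1}^{M} \sum_{\substack{n=1 \\ mn \le u}}^{N} a_m b_n \chi(mn) = \int_{-T}^{T} A(t, \chi) B(t, \chi) \frac{\sin(t \log u)}{\pi t} \, dt + O\Big( T^{-1} \sum_{m=1}^{M} \sum_{n=1}^{N} |a_m b_n| \Big| \log \frac{mn}{u} \Big|^{-1} \Big),$$

where

$$A(t, \chi) = \sum_{m=1}^{M} a_m \chi(m) m^{-it}, \qquad B(t, \chi) = \sum_{n=1}^{N} b_n \chi(n) n^{-it}.$$

Without loss of generality we may assume that $u$ is of the form
$u = k + \frac12$, where $k$ is an integer, $0 \le k \le MN$. Then

$$\Big| \log \frac{mn}{u} \Big| \gg \frac{1}{u} \gg \frac{1}{MN},$$

and

$$\sin(t \log u) \ll \min(1, |t| \log 2MN),$$

so that the right-hand side above is

$$\ll \int_{-T}^{T} |A(t, \chi) B(t, \chi)| \min\Big( \frac{1}{|t|}, \log 2MN \Big) \, dt + \frac{MN}{T} \sum_{m=1}^{M} \sum_{n=1}^{N} |a_m b_n|.$$

We now apply (6) to the first term, and Cauchy's inequality to the
second, in order to see that the left-hand side of (5) is

$$\ll (M + Q^2)^{1/2} (N + Q^2)^{1/2} \Big( \sum_{m=1}^{M} |a_m|^2 \Big)^{1/2} \Big( \sum_{n=1}^{N} |b_n|^2 \Big)^{1/2} \int_{-T}^{T} \min\Big( \frac{1}{|t|}, \log 2MN \Big) \, dt$$

$$+ M^{3/2} N^{3/2} Q^2 T^{-1} \Big( \sum_{m=1}^{M} |a_m|^2 \Big)^{1/2} \Big( \sum_{n=1}^{N} |b_n|^2 \Big)^{1/2}.$$

With $T = (MN)^{3/2}$, (5) now follows.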

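If $Q^2 > x$, then (2) follows from (5) on taking $M = 1$, $a_1 = 1$,
$b_n = \Lambda(n)$, $N = x$. We now assume that $Q^2 \le x$, and prove (2)
using the identity of §24. We have

$$\psi(y, \chi) = S_1 + S_2 + S_3 + S_4,$$

where

$$S_1 = \sum_{n \le U} \Lambda(n) \chi(n) \ll U, \tag{7}$$

$$S_2 = -\sum_{t \le UV} \Big( \sum_{\substack{t = md \\ m \le U,\ d \le V}} \mu(d) \Lambda(m) \Big) \sum_{r \le y/t} \chi(rt), \tag{8}$$

$$S_3 \ll (\log y) \sum_{d \le V} \max_w \Big| \sum_{w \le h \le y/d} \chi(h) \Big| \tag{9}$$

and

$$S_4 = \sum_{U < m \le y/V} \Lambda(m) \sum_{V < k \le y/m} \Big( \sum_{\substack{d \mid k \\ d \le V}} \mu(d) \Big) \chi(mk). \tag{10}$$

Here $y$ depends on $\chi$, $y \le x$, but we shall choose $U$ and $V$ later as
functions of $Q$ and $x$ only.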
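To treat $S_4$ we first note that by (5),

$$\sum_{q \le Q} \frac{q}{\phi(q)} \sum_{\chi}^{*} \max_{y \le x} \Big| \sum_{\substack{U < m \le y/V \\ M < m \le 2M}} \Lambda(m) \sum_{V < k \le y/m} \Big( \sum_{\substack{d \mid k \\ d \le V}} \mu(d) \Big) \chi(mk) \Big|$$

$$\ll (Q^2 + M)^{1/2} \Big( Q^2 + \frac{x}{M} \Big)^{1/2} \Big( \sum_{M < m \le 2M} \Lambda(m)^2 \Big)^{1/2} \Big( \sum_{k \le x/M} d(k)^2 \Big)^{1/2} \log x$$

$$\ll (Q^2 x^{1/2} + QxM^{-1/2} + Qx^{1/2} M^{1/2} + x) (\log x)^3. \tag{11}$$

Here we have used the elementary estimates for $\sum \Lambda(m)^2$ and
$\sum d(k)^2$ which we proved in §24. We sum (11) over $M = 2^k$ for
$\frac12 U < 2^k \le x/V$, and thus find that

$$\sum_{q \le Q} \frac{q}{\phi(q)} \sum_{\chi}^{*} \max_{y \le x} |S_4| \ll (Q^2 x^{1/2} + QxU^{-1/2} + QxV^{-1/2} + x) (\log x)^4. \tag{12}$$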


To treat $S_2$ we consider two ranges of $t$, by writing

$$S_2 = \sum_{t \le UV} = \sum_{t \le U} + \sum_{U < t \le UV} = S_2' + S_2''.$$

We deal with $S_2''$ exactly as we did with $S_4$, and we find that

$$\sum_{q \le Q} \frac{q}{\phi(q)} \sum_{\chi}^{*} \max_{y \le x} |S_2''| \ll (Q^2 x^{1/2} + QxU^{-1/2} + Qx^{1/2} U^{1/2} V^{1/2} + x) (\log x)^3. \tag{13}$$

On the other hand,

$$S_2' \ll (\log U) \sum_{t \le U} \Big| \sum_{r \le y/t} \chi(r) \Big|,$$

and by the Pólya-Vinogradov inequality of §23 we see that

$$S_2' \ll q^{1/2} U (\log qU)^2$$

uniformly for $y \le x$. However, this applies only when $q > 1$; for
$q = 1$ we have the trivial bound

$$S_2' \ll x (\log xU)^2.$$

On combining these estimates we find that

$$\sum_{q \le Q} \frac{q}{\phi(q)} \sum_{\chi}^{*} \max_{y \le x} |S_2'| \ll (Q^{5/2} U + x) (\log Ux)^2. \tag{14}$$

We treat $S_3$ as we did $S_2'$, and find that

$$\sum_{q \le Q} \frac{q}{\phi(q)} \sum_{\chi}^{*} \max_{y \le x} |S_3| \ll (Q^{5/2} V + x) (\log Vx)^2. \tag{15}$$

On combining estimates (7), (12)-(15), we see that the left-hand
side of (2) is

$$\ll (Q^2 x^{1/2} + x + QxU^{-1/2} + QxV^{-1/2} + U^{1/2} V^{1/2} Q x^{1/2} + Q^{5/2} U + Q^{5/2} V) (\log xUV)^4.$$

If we allow $U$ and $V$ to vary in such a way that the product $UV$ is
fixed, we see that the above is minimized by taking $U = V$. If
$x^{1/3} \le Q \le x^{1/2}$, then the terms involving $U$ are minimized by taking
$U = x^{2/3} Q^{-1}$, and then their contribution is

$$\ll Q^{3/2} x^{2/3} \ll Q^2 x^{1/2}.$$

If $1 \le Q \le x^{1/3}$, then the terms involving $U$ are minimized by taking
$U = x^{1/3}$, and then their contribution is $\ll x^{5/6} Q$. Hence we have (2),
and the proof is complete.
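The two choices of $U$ can be verified by hand. The following worked check is an added illustration; it takes $U = V$ and reads the three $U$-dependent terms of the combined bound as $QxU^{-1/2}$, $UQx^{1/2}$ and $Q^{5/2}U$.

For $x^{1/3} \le Q \le x^{1/2}$ and $U = x^{2/3} Q^{-1}$:

$$QxU^{-1/2} = Q^{3/2} x^{2/3}, \qquad Q^{5/2} U = Q^{3/2} x^{2/3}, \qquad UQx^{1/2} = x^{7/6} \le Q^{3/2} x^{2/3},$$

and $Q^{3/2} x^{2/3} \le Q^2 x^{1/2}$ because $x^{1/6} \le Q^{1/2}$.

For $1 \le Q \le x^{1/3}$ and $U = x^{1/3}$:

$$QxU^{-1/2} = Qx^{5/6}, \qquad UQx^{1/2} = Qx^{5/6}, \qquad Q^{5/2} U = Q^{5/2} x^{1/3} \le Qx^{5/6},$$

the last step because $Q^{3/2} \le x^{1/2}$. In both ranges the $U$-dependent terms are therefore absorbed by $x^{5/6} Q + x^{1/2} Q^2$, as (2) requires.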

We have followed here Vaughan's proof⁴ of Bombieri's theorem,
which differs significantly from the approach used previously by
Rényi and Bombieri. They used the large sieve to estimate the
number of zeros of $L$ functions in various rectangles, and then they
derived an estimate corresponding to (2) by means of the explicit
formulae of §19. Let $N(\sigma, T, \chi)$ denote the number of zeros $\rho$ of
$L(s, \chi)$ in the rectangle $\sigma \le \beta \le 1$, $|\gamma| \le T$. Bombieri⁵ proved that

$$\sum_{q \le Q} \sum_{\chi}^{*} N(\sigma, T, \chi) \ll T (Q^2 + QT)^{4(1-\sigma)/(3-2\sigma)} (\log QT)^{10};$$

this was subsequently improved by Montgomery⁶ (see also Bombieri⁷).
Gallagher⁸ proved Bombieri's theorem without discussing
zeros, by applying the Mellin transform to the identity

$$-\frac{L'}{L} = -\frac{L'}{L} (1 - LG)^2 - 2L'G + L'LG^2.$$

Vaughan⁹ found that it was more efficient to use the identity

$$-\frac{L'}{L} = F - LFG - L'G + \Big( -\frac{L'}{L} - F \Big) (1 - LG); \tag{16}$$


he showed that

$$\sum_{q \le Q} \frac{q}{\phi(q)} \sum_{\chi}^{*} \max_{y \le x} |\psi(y, \chi)| \ll (Q^2 x^{1/2} + Q^{5/4} x^{3/4} + x) (\log Qx)^4.$$


Then Vaughan discovered that the identity (16) could be used to
provide a new form of Vinogradov's method; this permitted us to
derive the sharp estimate (2) by essentially elementary means.


4 To appear in the Turán memorial volume of Acta Arithmetica, 37, 111-115 (1980).

5 Mathematika, 12, 201-225 (1965).

6 Topics in Multiplicative Number Theory, Springer-Verlag, Berlin (1971) Chapter
12.

7 Le grand crible dans la théorie analytique des nombres, Astérisque No. 18, Soc.
Math. France, Paris, 1974.

8 Mathematika, 15, 1-6 (1968).

9 J. London Math. Soc., (2) 10, 153-162 (1975).
AN AVERAGE RESULT


We now consider the mean square error in the prime number
theorem for arithmetic progressions. Work in this direction was
initiated by Barban¹, and by Davenport and Halberstam². Their
results were sharpened by Gallagher³, who showed that

$$\sum_{q \le Q} \sum_{\substack{a=1 \\ (a,q)=1}}^{q} \Big( \psi(x; q, a) - \frac{x}{\phi(q)} \Big)^2 \ll xQ \log x \tag{1}$$

for $x(\log x)^{-A} \le Q \le x$; here $A > 0$ is fixed. This estimate is best
possible, for Montgomery⁴ has shown that the left-hand side is
$\sim Qx \log x$ for $Q$ in the stated range. Moreover, Hooley⁵ has shown
that (1) can be combined with some of Montgomery's ideas to give,
in a simple way, a very precise asymptotic estimate.

The estimate (1) differs from Bombieri's theorem of §28 in that we
have a much longer range of $q$, and we consider a mean over residue
classes instead of the maximum. We again use the large sieve, but
now the proof is simpler than in the case of Bombieri's theorem. In
fact, by the large sieve in the form of Theorem 4 of §27, with
$a_n = \Lambda(n)$, we have

$$\sum_{q \le Q} \frac{q}{\phi(q)} \sum_{\chi}^{*} |\psi(x, \chi)|^2 \ll (x + Q^2) x \log x, \tag{2}$$

since $\sum_{n \le x} \Lambda(n)^2 \ll x \log x$. We now derive (1) from (2) in much the
same way that we derived (1) from (2) in the previous section. As in
that argument,

$$\psi(x; q, a) - \frac{x}{\phi(q)} = \frac{1}{\phi(q)} \sum_{\chi} \bar\chi(a) \psi'(x, \chi). \tag{3}$$


1 Dokl. Akad. Nauk UzSSR, 1964, No. 5, 5-7.

2 Michigan Math. J., 13, 485-489 (1966); 15, 505 (1968).

3 Mathematika, 14, 14-20 (1967).

4 Michigan Math. J., 17, 33-39 (1970).

5 J. Reine Angew. Math., 274/275, 206-223 (1975).
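We now form the square of the modulus of both sides, and sum over $a$.
We expand the right-hand side and take the sum over $a$ inside, to
see that

$$\sum_{a=1}^{q} \Big| \sum_{\chi} \bar\chi(a) \psi'(x, \chi) \Big|^2 = \phi(q) \sum_{\chi} |\psi'(x, \chi)|^2,$$

since

$$\sum_{a=1}^{q} \chi_1(a) \bar\chi_2(a) = \begin{cases} \phi(q) & \text{if } \chi_1 = \chi_2, \\ 0 & \text{if } \chi_1 \ne \chi_2. \end{cases}$$

Thus from (3),

$$\sum_{\substack{a=1 \\ (a,q)=1}}^{q} \Big( \psi(x; q, a) - \frac{x}{\phi(q)} \Big)^2 = \frac{1}{\phi(q)} \sum_{\chi} |\psi'(x, \chi)|^2.$$

As in the previous section, if $\chi$ is induced by $\chi_1$, then

$$\psi'(x, \chi) = \psi'(x, \chi_1) + O((\log qx)^2).$$

Hence

$$\sum_{\substack{a=1 \\ (a,q)=1}}^{q} \Big( \psi(x; q, a) - \frac{x}{\phi(q)} \Big)^2 \ll (\log qx)^4 + \frac{1}{\phi(q)} \sum_{\chi} |\psi'(x, \chi_1)|^2.$$

Here the first term on the right is negligible, so that to prove (1) it
suffices to show that

$$\sum_{q \le Q} \frac{1}{\phi(q)} \sum_{\chi} |\psi'(x, \chi_1)|^2 \ll xQ \log x.$$

If $\chi$ is primitive (mod $q$), then $\chi$ induces characters to moduli which
are multiples of $q$; hence the left-hand side above is

$$\sum_{q \le Q} \sum_{\chi}^{*} |\psi'(x, \chi)|^2 \sum_{k \le Q/q} \frac{1}{\phi(kq)}.$$

As in the previous section, the innermost sum is $\ll \phi(q)^{-1} \log(2Q/q)$.
Hence it suffices to show that

$$\sum_{q \le Q} \frac{1}{\phi(q)} \Big( \log \frac{2Q}{q} \Big) \sum_{\chi}^{*} |\psi'(x, \chi)|^2 \ll xQ \log x \tag{4}$$

for $x(\log x)^{-A} \le Q \le x$. We consider large and small $q$ separately.
From (2) we see that

$$\sum_{U < q \le 2U} \frac{1}{\phi(q)} \Big( \log \frac{2Q}{q} \Big) \sum_{\chi}^{*} |\psi(x, \chi)|^2 \ll (x^2 U^{-1} + Ux) (\log x) \log \frac{2Q}{U}$$

for $1 \le U \le Q$. Summing over $U = Q2^{-k}$, we find that

$$\sum_{Q_1 < q \le Q} \frac{1}{\phi(q)} \Big( \log \frac{2Q}{q} \Big) \sum_{\chi}^{*} |\psi(x, \chi)|^2 \ll x^2 Q_1^{-1} (\log x)^2 + Qx \log x.$$

This suffices in (4), if $x(\log x)^{-A} \le Q \le x$ and $Q_1 = (\log x)^{A+1}$. By
estimate (3) of §22,

$$\psi'(x, \chi) \ll x \exp(-c_1 \sqrt{\log x})$$

for $q \le (\log x)^{A+1}$; hence the contribution of $q \le Q_1$ in (4) is

$$\ll Q_1 (\log Q) x^2 \exp(-c_2 \sqrt{\log x}) \ll x^2 (\log x)^{-A} \ll Qx \log x.$$

Thus we have established (4), and the proof is complete.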
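The mean square on the left-hand side of (1) is easy to compute for small $x$, which gives a rough feel for its size relative to $Qx \log x$. The sketch below is an illustration with hypothetical function names; at such small $x$ the lower-order terms are not negligible, so the printed ratios should not be read as a test of the asymptotic statement.

```python
import math

def von_mangoldt_table(x):
    """lam[n] = Lambda(n) for 0 <= n <= x: log p at prime powers p^k, else 0."""
    lam = [0.0] * (x + 1)
    is_comp = bytearray(x + 1)
    for p in range(2, x + 1):
        if not is_comp[p]:
            for m in range(p * p, x + 1, p):
                is_comp[m] = 1
            pk = p
            while pk <= x:
                lam[pk] = math.log(p)
                pk *= p
    return lam

def mean_square(x, Q, lam):
    """Sum over q <= Q and reduced a (mod q) of (psi(x; q, a) - x/phi(q))^2."""
    total = 0.0
    for q in range(1, Q + 1):
        psi = [0.0] * q
        for n in range(1, x + 1):
            psi[n % q] += lam[n]
        reduced = [a for a in range(q) if math.gcd(a, q) == 1]
        phi = len(reduced)
        total += sum((psi[a] - x / phi) ** 2 for a in reduced)
    return total

x = 1000
lam = von_mangoldt_table(x)
ratios = [mean_square(x, Q, lam) / (Q * x * math.log(x)) for Q in (x // 8, x // 2, x)]
print(ratios)  # mean square divided by Q * x * log x, for three values of Q
```

For $q = 1$ the inner sum reduces to $(\psi(x) - x)^2$, which provides a simple consistency check on the routine.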
REFERENCES TO OTHER WORK


The principal omission in these lectures has been the lack of any 
account of work on irregularities of distributions, both of the primes 
as a whole and of the primes in the various progressions to the same 
modulus q. 

As regards irregularities in the distribution of the primes as a
whole, the first point to be noted is that in this connection it is no
longer possible to make inferences from the behavior of $\psi(x)$ to that
of $\pi(x)$. It was proved by E. Schmidt in 1903, by relatively elementary
arguments, that

$$\psi(x) - x = \Omega_{\pm}(x^{1/2}),$$

where the notation means that there exist arbitrarily large values of
$x$ for which

$$\psi(x) - x > cx^{1/2},$$

where $c$ is some positive constant, and other arbitrarily large values
of $x$ for which

$$\psi(x) - x < -cx^{1/2}.$$

But the analogous problem for $\pi(x) - \operatorname{li} x$ was much more difficult.
It had been conjectured, on numerical evidence, that $\pi(x) < \operatorname{li} x$
for all large $x$. This was disproved by Littlewood in 1914; he showed,
in fact, that

$$\pi(x) - \operatorname{li} x = \Omega_{\pm}\Big( \frac{x^{1/2} \log\log\log x}{\log x} \Big).$$

Littlewood's proof¹ was divided into two cases, according as the
Riemann hypothesis is true or false, the former being the difficult
case. Owing to its indirect character, the proof did not make it
possible to name a particular number $x_0$ such that $\pi(x) > \operatorname{li} x$ for
some $x < x_0$. It was not until 1955 that such a number was found,
namely by Skewes²; his number was $10_4(3)$, where $10_1(x) = 10^x$,
$10_2(x) = 10^{10_1(x)}$, and so on.


1 See Ingham, Chap. 5, or Prachar, Chap. 7, §8.

Questions concerning the irregularity of distribution of the primes,
as between one residue class to the modulus $q$ and another, have been
deeply studied in recent papers³ on comparative prime number
theory, by Turán and Knapowski. It is impossible to give any useful
account of their work here, but one particular result may be mentioned
as a sample. Suppose that, for each character $\chi$ (mod $q$), the
function $L(s, \chi)$ has no zero in the rectangle

$$0 < \sigma < 1, \qquad |t| \le \delta.$$

Then, if $a_1 \not\equiv a_2$ (mod $q$), the difference

$$\psi(x; q, a_1) - \psi(x; q, a_2)$$

changes sign at least once in every interval

$$\omega \le x \le \exp(2\sqrt{\omega}),$$

provided $\omega$ is greater than a certain explicit function of $q$ and $\delta$.
Some of their results are independent of any such unproved hypothesis.
The work of Turán and Knapowski is based in part on some
of the methods developed by Turán in his book Eine neue Methode in
der Analysis und deren Anwendungen (Budapest, 1953).

The problem of finding an upper bound for the least prime in a
given arithmetic progression has received a remarkably satisfactory
solution (considering its inherent difficulty) at the hands of Linnik.
He proved⁴ that there exists an absolute constant $C$ such that, if
$(a, q) = 1$, there is always a prime $p \equiv a$ (mod $q$) satisfying $p < q^C$.
The proof is difficult.

A subject that has attracted attention, but concerning which the
known results leave much to be desired, is that of the behavior of
$p_{n+1} - p_n$, where $p_n$ denotes the $n$th prime. As regards a universal
upper bound for this difference, the first result was found by Hoheisel,
who proved that there exists a constant $\alpha$, less than 1, such that
$p_{n+1} - p_n = O(p_n^{\alpha})$. The best result so far known is due to Ingham,⁵
who showed that this estimate holds for any $\alpha$ greater than 38/61.


2 Proc. London Math. Soc., (3) 5, 48-69 (1955).

3 The main series consists of eight papers in Acta Math. Hungaricae, 13 (1962) and
14 (1963), and a sequel of three papers in Acta Arithmetica 9, 10, 11 (1964-1965),
together with a paper in J. Analyse Math., 14 (1965).

4 See Prachar, Chap. 10.

5 Quarterly J. of Math., 8, 255-266 (1937).
In both cases, what is actually proved is that

$$\pi(x + x^{\alpha}) - \pi(x) \sim \frac{x^{\alpha}}{\log x} \quad \text{as } x \to \infty.$$

In a crude sense one can say, in view of the prime number theorem,
that the average of $p_{n+1} - p_n$ is $\log p_n$. Erdős was the first to prove
that there are infinitely many $n$ for which $p_{n+1} - p_n$ is appreciably
greater than $\log p_n$, and Rankin⁶ proved that there are infinitely
many $n$ for which

$$p_{n+1} - p_n > c (\log p_n) \frac{(\log_2 p_n)(\log_4 p_n)}{(\log_3 p_n)^2},$$

where $\log_2 x = \log\log x$ and so on, and $c$ is a positive constant.
In the opposite direction, Bombieri and I proved recently⁷ that
there are infinitely many $n$ for which

$$p_{n+1} - p_n < (0.46\ldots) \log p_n.$$

Of course, if the "prime twins" conjecture is true, there are infinitely
many $n$ for which $p_{n+1} - p_n = 2$.
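The statement that the gap between consecutive primes is $\log p_n$ on average can be observed directly. The sketch below (function names and the cutoff $10^6$ are illustrative choices made here) lists the normalized gaps $(p_{n+1} - p_n)/\log p_n$ for primes up to a limit and prints their mean, minimum and maximum.

```python
import math

def primes_up_to(n):
    """Primes p <= n via the sieve of Eratosthenes (assumes n >= 2)."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, math.isqrt(n) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [p for p in range(2, n + 1) if sieve[p]]

def normalized_gaps(limit):
    """(p_{n+1} - p_n) / log p_n for consecutive primes up to limit."""
    ps = primes_up_to(limit)
    return [(q - p) / math.log(p) for p, q in zip(ps, ps[1:])]

gaps = normalized_gaps(10**6)
print(sum(gaps) / len(gaps))  # mean of the normalized gaps
print(min(gaps), max(gaps))   # smallest and largest normalized gap in this range
```

A finite computation of this kind only illustrates the average behavior; it gives no information about the extreme results of Rankin or of Bombieri and Davenport quoted above.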

There is a somewhat paradoxical situation in connection with the
limit points of the sequence

$$\frac{p_{n+1} - p_n}{\log p_n}.$$

Erdős and Ricci (independently) have shown that the set of limit
points has positive Lebesgue measure, and yet no number is known
for which it can be asserted that it belongs to the set.

For references to other work in multiplicative number theory,
one should consult, in the first place, the articles of Bohr and
Cramér, and of Hua.


6 J. London Math. Soc., 13, 242-247 (1938).

7 Proc. Royal Soc. (London), A, 293, 1-18 (1966).